Description
Describe the bug
We wanted to pass custom limits to the httpx client in order to change the maximum number of connections and of keepalive connections. Since BentoML does not support this, we subclassed httpx.AsyncClient and set the limits there. We then applied the subclass by overriding client_cls, but this does not work.
BentoML checks client_cls is httpx.Client and client_cls is httpx.AsyncClient in several places, but MyHttpxClient is not httpx.AsyncClient, although it inherits from it. If BentoML used issubclass instead, everything would work.
Please note that this is not primarily a feature request for custom limits (although that would also be great!), but rather a request for extensibility of the BentoML client.
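The difference between the two checks can be seen in a minimal sketch. To keep it runnable without httpx installed, Client and AsyncClient below are stand-ins for httpx.Client and httpx.AsyncClient, and the two helper functions are hypothetical names used only for illustration; they are not BentoML code.

```python
# Stand-ins for httpx.Client / httpx.AsyncClient (assumption: used here only
# so the sketch runs without httpx installed).
class Client: ...
class AsyncClient: ...


class MyHttpxClient(AsyncClient):
    """Hypothetical subclass, e.g. one that sets custom connection limits."""


def is_async_by_identity(client_cls: type) -> bool:
    # Identity check: true only for the exact class object.
    return client_cls is AsyncClient


def is_async_by_subclass(client_cls: type) -> bool:
    # Subclass check: true for the class itself and anything derived from it.
    return issubclass(client_cls, AsyncClient)


print(is_async_by_identity(AsyncClient))    # True
print(is_async_by_identity(MyHttpxClient))  # False
print(is_async_by_subclass(AsyncClient))    # True
print(is_async_by_subclass(MyHttpxClient))  # True
```

With the identity check, a subclass is treated as neither sync nor async, which matches the behavior described above.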
To reproduce
import bentoml
import httpx

class MyHttpxClient(httpx.AsyncClient): ...

class MyClient(bentoml.AsyncHTTPClient):
    client_cls = MyHttpxClient

async def test_it() -> None:
    async with MyClient(...) as client:
        await client.predict(...)
This fails with "RuntimeError: Attempted to send an sync request with an AsyncClient instance." because of the following check:
BentoML/src/_bentoml_impl/client/http.py
Lines 226 to 233 in a567a3a
return self.client.build_request(
    "POST",
    endpoint.route,
    headers=headers,
    content=to_async_iterable(payload.data)
    if self.client_cls is httpx.AsyncClient
    else payload.data,
)
(Recall that self.client_cls is not httpx.AsyncClient here.)
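One possible shape for a subclass-aware version of this branch is sketched below. This is not the BentoML implementation: AsyncClient and Client are stand-ins for the httpx classes, to_async_iterable is a simplified stand-in for the BentoML helper of the same name (its real signature is not shown in this issue), and select_content is a hypothetical function introduced only for illustration.

```python
import asyncio
from typing import AsyncIterator, Union


# Stand-ins for httpx.Client / httpx.AsyncClient (assumption, for a runnable sketch).
class Client: ...
class AsyncClient: ...


class MyHttpxClient(AsyncClient):
    """Hypothetical user-defined subclass."""


async def to_async_iterable(data: bytes) -> AsyncIterator[bytes]:
    # Simplified stand-in: yields the payload as a single chunk.
    yield data


def select_content(client_cls: type, data: bytes) -> Union[bytes, AsyncIterator[bytes]]:
    # issubclass accepts AsyncClient itself and any class derived from it,
    # so a custom subclass also receives an async iterable body.
    if issubclass(client_cls, AsyncClient):
        return to_async_iterable(data)
    return data


async def _collect(stream: AsyncIterator[bytes]) -> list:
    return [chunk async for chunk in stream]


# The subclass gets an async body; the sync client gets the raw bytes.
chunks = asyncio.run(_collect(select_content(MyHttpxClient, b"payload")))
sync_body = select_content(Client, b"payload")
```

The same substitution would presumably be needed at the other places where client_cls is compared by identity.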
Expected behavior
BentoML uses the overridden client class, and requests made through it succeed.
Environment
bentoml: a567a3a
python: 3.12
OS: Arch Linux