Transfer aggregation of streaming events off the Model class #1449
Conversation
"LiteLLMModel", | ||
"LiteLLMRouterModel", | ||
"OpenAIServerModel", | ||
"OpenAIModel", |
Adding a copy of the class with "Server" removed from the name, for easier access.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
```diff
@@ -17,11 +17,6 @@
 logger = getLogger(__name__)


-class Message(TypedDict):
```
@albertvillanova Since `Message` and `ChatMessage` were mostly interchangeable, I fused them.
One potential difficulty to consider is that the class is no longer a TypedDict, so it cannot be treated as a dict.
But this hasn't caused any implementation problems so far: we just handle `ChatMessage` objects internally, and do the dict conversion in Model subclasses just before sending messages to inference.
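For illustration, that conversion could look something like this minimal sketch (the fields and the `to_dict` helper here are hypothetical, not the exact smolagents API):

```python
from dataclasses import asdict, dataclass


# Hypothetical sketch of ChatMessage as a plain class instead of a TypedDict.
@dataclass
class ChatMessage:
    role: str
    content: str

    def to_dict(self) -> dict:
        # Plain-dict form, suitable for sending to an inference API.
        return asdict(self)


# A Model subclass would convert just before the inference call:
def prepare_for_inference(messages: list[ChatMessage]) -> list[dict]:
    return [m.to_dict() for m in messages]
```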
OK... I thought the purpose of `to_messages` was to convert the steps into a format directly consumable by the model (so plain dicts instead of instance objects).
Branch force-pushed from 2f33e4c to 8c8ebaa.
Thanks, good refactoring!
```diff
@@ -66,7 +66,7 @@ print(model([{"role": "user", "content": [{"type": "text", "text": "Ok!"}]}], st

 ### InferenceClientModel

-The `HfApiModel` wraps huggingface_hub's [InferenceClient](https://huggingface.co/docs/huggingface_hub/main/en/guides/inference) for the execution of the LLM. It supports all [Inference Providers](https://huggingface.co/docs/inference-providers/index) available on the Hub: Cerebras, Cohere, Fal, Fireworks, HF-Inference, Hyperbolic, Nebius, Novita, Replicate, SambaNova, Together, and more.
+The `InferenceClientModel` wraps huggingface_hub's [InferenceClient](https://huggingface.co/docs/huggingface_hub/main/en/guides/inference) for the execution of the LLM. It supports all [Inference Providers](https://huggingface.co/docs/inference-providers/index) available on the Hub: Cerebras, Cohere, Fal, Fireworks, HF-Inference, Hyperbolic, Nebius, Novita, Replicate, SambaNova, Together, and more.
```
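For reference, typical usage of the renamed class looks roughly like this (the `model_id` value is just an example):

```python
from smolagents import InferenceClientModel

# Example only: any model available through an Inference Provider should work.
model = InferenceClientModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct")
response = model([{"role": "user", "content": [{"type": "text", "text": "Ok!"}]}])
print(response)
```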
Good catch! I guess this was missed in:
```python
class HfApiModel(InferenceClientModel):
    def __new__(cls, *args, **kwargs):
        warnings.warn(
            "HfApiModel was renamed to InferenceClientModel in version 1.14.0 and will be removed in 1.17.0.",
            FutureWarning,
        )
        return super().__new__(cls)
```
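As a quick sanity check, instantiating the deprecated name should surface the warning; a sketch assuming the shim above is importable:

```python
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    HfApiModel()  # constructor arguments omitted for brevity
    # The deprecation shim should have emitted a FutureWarning.
    assert any(issubclass(w.category, FutureWarning) for w in caught)
```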
Indeed, this was on my TODO list for the next release.
I kept it in the last release because, although the class was renamed in v1.14.0, the deprecation warning was only added later, in v1.16.0.
So I wanted to allow a bit more time for users to update their code before removing it entirely.
src/smolagents/models.py (outdated)
```python
class OpenAIModel(OpenAIServerModel):
    def __new__(cls, *args, **kwargs):
        return super().__new__(cls)
```
Is this just an alias, or are you planning to deprecate `OpenAIServerModel`?

If this is just an alias and both names are equally valid, then I would suggest:

```python
OpenAIModel = OpenAIServerModel
```

If you are planning to deprecate `OpenAIServerModel`, then you should inherit the other way around:

```python
class OpenAIServerModel(OpenAIModel):
    def __new__(cls, *args, **kwargs):
        warnings.warn(
            "OpenAIServerModel was renamed to OpenAIModel in version 1.19.0 and will be removed in 1.22.0. "
            "Please use OpenAIModel instead.",
            FutureWarning,
            stacklevel=2,
        )
        return super().__new__(cls)
```
It's an alias, so I'll just copy it!
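For what it's worth, the plain alias keeps both names pointing at the same class object, so isinstance checks and subclass relationships are unaffected; a self-contained sketch with a stub class:

```python
class OpenAIServerModel:  # stub standing in for the real class
    pass


# Plain alias: both names refer to the exact same class object.
OpenAIModel = OpenAIServerModel

assert OpenAIModel is OpenAIServerModel
assert isinstance(OpenAIModel(), OpenAIServerModel)
```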
Thank you for your comments! So if we think again of the distinction...
This PR changes the logic of streaming messages: aggregation of streaming events is moved off the Model class, so models emit individual deltas rather than pre-aggregated messages.
The reason for this is that front-ends like copilotkit generally expect individual streaming deltas.
Additionally, it removes the `HfApiModel` class, which was deprecated and due for deletion in 1.17, and fuses `Message` into `ChatMessage`.
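A rough sketch of the resulting shape, where the model yields raw deltas and aggregation lives outside the Model class (all names here are illustrative, not the exact smolagents API):

```python
from dataclasses import dataclass


@dataclass
class StreamDelta:
    content: str


# Sketch: the model yields individual deltas, which front-ends like
# copilotkit can forward one by one...
def generate_stream(prompt: str):
    for chunk in ("Hel", "lo", "!"):
        yield StreamDelta(content=chunk)


# ...while aggregation into a full message happens in the caller, not the Model.
def aggregate(deltas) -> str:
    return "".join(d.content for d in deltas)


assert aggregate(generate_stream("hi")) == "Hello!"
```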