Transfer aggregation of streaming events off the Model class #1449


Merged: 37 commits, Jun 20, 2025

Conversation

@aymeric-roucher (Collaborator) commented Jun 17, 2025

This PR changes the logic of streaming messages:

  • Previously, the logic was to accumulate streaming deltas in the Model classes, and yield objects that "contain all the generated text since start of streaming until now"
  • This PR makes the Model class directly return atomic streaming deltas, so that aggregation is handled only within the agent.
    The reason for this is that front-ends like copilotkit generally expect individual streaming deltas.

Additionally, it removes the HfApiModel class, which was deprecated and due for deletion in 1.17, and fuses Message into ChatMessage.
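Concretely, the before/after difference for streaming can be sketched as follows. All names here (StreamDelta, model_stream, agent_run) are illustrative stand-ins, not the actual smolagents API:

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class StreamDelta:
    # An atomic chunk of newly generated text (illustrative stand-in for
    # the delta events the Model now yields).
    content: str


def model_stream() -> Iterator[StreamDelta]:
    # After this PR, the Model yields each delta as-is, without accumulating
    # "all text generated since the start of streaming".
    for chunk in ["Hel", "lo ", "world"]:
        yield StreamDelta(content=chunk)


def agent_run() -> str:
    # Aggregation now happens only in the agent; the raw deltas can still be
    # forwarded untouched to a front-end that expects individual deltas.
    accumulated = ""
    for delta in model_stream():
        accumulated += delta.content  # agent-side aggregation
    return accumulated
```

With aggregation moved here, a UI layer subscribed to `model_stream()` sees exactly the per-chunk events it expects.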

"LiteLLMModel",
"LiteLLMRouterModel",
"OpenAIServerModel",
"OpenAIModel",
@aymeric-roucher (Collaborator, Author):

Adding a copy of the class with "Server" removed in the name for easier access


@@ -17,11 +17,6 @@
logger = getLogger(__name__)


class Message(TypedDict):
@aymeric-roucher (Collaborator, Author):

@albertvillanova since Message and ChatMessage were mostly interchangeable, I fused them.
One potential difficulty to consider is that the class is not a TypedDict anymore, so it can no longer be treated as a dict.
But this hasn't really created any implementation problem so far: we just handle ChatMessage objects internally, and can handle the dict conversion in Model subclasses just before sending messages to inference.
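A minimal sketch of that boundary, assuming a dataclass-based ChatMessage (the field set and the to_openai_dicts helper are hypothetical simplifications):

```python
from dataclasses import dataclass, asdict


@dataclass
class ChatMessage:
    # A plain class instead of a TypedDict: handled as an object internally,
    # so it is no longer itself a dict.
    role: str
    content: str


def to_openai_dicts(messages: list[ChatMessage]) -> list[dict]:
    # A Model subclass converts to plain dicts only at the last moment,
    # just before sending the messages to the inference backend.
    return [asdict(m) for m in messages]
```

The conversion cost is paid once per request at the inference boundary, while the rest of the codebase works with typed objects.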

@albertvillanova (Member):

OK... I thought the purpose of to_messages was to convert the steps into a format directly consumable by the model (so plain dicts instead of instance objects).

@aymeric-roucher force-pushed the richer-streaming-events branch from 2f33e4c to 8c8ebaa on June 19, 2025 16:16
@aymeric-roucher aymeric-roucher marked this pull request as ready for review June 19, 2025 16:30
@albertvillanova (Member) left a comment:

Thanks, good refactoring!

@@ -66,7 +66,7 @@ print(model([{"role": "user", "content": [{"type": "text", "text": "Ok!"}]}], st

### InferenceClientModel

- The `HfApiModel` wraps huggingface_hub's [InferenceClient](https://huggingface.co/docs/huggingface_hub/main/en/guides/inference) for the execution of the LLM. It supports all [Inference Providers](https://huggingface.co/docs/inference-providers/index) available on the Hub: Cerebras, Cohere, Fal, Fireworks, HF-Inference, Hyperbolic, Nebius, Novita, Replicate, SambaNova, Together, and more.
+ The `InferenceClientModel` wraps huggingface_hub's [InferenceClient](https://huggingface.co/docs/huggingface_hub/main/en/guides/inference) for the execution of the LLM. It supports all [Inference Providers](https://huggingface.co/docs/inference-providers/index) available on the Hub: Cerebras, Cohere, Fal, Fireworks, HF-Inference, Hyperbolic, Nebius, Novita, Replicate, SambaNova, Together, and more.

@albertvillanova (Member):

Good catch! I guess this was missed in:

Comment on lines -1422 to -1428
class HfApiModel(InferenceClientModel):
    def __new__(cls, *args, **kwargs):
        warnings.warn(
            "HfApiModel was renamed to InferenceClientModel in version 1.14.0 and will be removed in 1.17.0.",
            FutureWarning,
        )
        return super().__new__(cls)
@albertvillanova (Member):

Indeed, this was in my TODO list for the next release.

I kept it in the last release because, although the class was renamed in v1.14.0, I only added the deprecation warning later, in v1.16.0.

So I wanted to allow a bit more time for users to update their code before removing it entirely.

Comment on lines 1619 to 1621
class OpenAIModel(OpenAIServerModel):
    def __new__(cls, *args, **kwargs):
        return super().__new__(cls)
@albertvillanova (Member):

Is this just an alias or are you planning to deprecate OpenAIServerModel?

If this is just an alias and both are identically valid, then I would suggest:

OpenAIModel = OpenAIServerModel

If you are planning to deprecate OpenAIServerModel, then the inheritance should be inverted:

class OpenAIServerModel(OpenAIModel):
    def __new__(cls, *args, **kwargs):
        warnings.warn(
            "OpenAIServerModel was renamed to OpenAIModel in version 1.19.0 and will be removed in 1.22.0. "
            "Please use OpenAIModel instead.",
            FutureWarning,
            stacklevel=2
        )
        return super().__new__(cls)

@aymeric-roucher (Collaborator, Author):

It's an alias, so I'll just copy it!

@aymeric-roucher (Collaborator, Author):
Thank you for your comments! Thinking again about the distinction between ChatMessage and Message: Message was just a less complete, dict-converted version of ChatMessage, hence the fusion of the two.
to_messages is a way to convert memory steps to chat messages; these messages are not particularly expected to already be dictionaries.
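A rough sketch of that reading of to_messages, where steps are converted into ChatMessage objects rather than plain dicts (ActionStep, its single field, and this standalone to_messages are hypothetical simplifications of the real memory classes):

```python
from dataclasses import dataclass


@dataclass
class ChatMessage:
    role: str
    content: str


@dataclass
class ActionStep:
    # Illustrative memory step; real steps carry much more state
    # (tool calls, observations, token counts, ...).
    model_output: str


def to_messages(steps: list[ActionStep]) -> list[ChatMessage]:
    # Converts memory steps into chat messages. The result is made of
    # ChatMessage objects; any dict conversion happens later, inside the
    # Model subclass, right before inference.
    return [ChatMessage(role="assistant", content=s.model_output) for s in steps]
```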

Successfully merging this pull request may close these issues.

'NoneType' object has no attribute 'input_tokens'
3 participants