Agent.run() can return RunResult object #1337
Conversation
```diff
 # Choose which inference type to use!
-available_inferences = ["hf_api", "hf_api_provider", "transformers", "ollama", "litellm", "openai"]
-chosen_inference = "hf_api_provider"
+available_inferences = ["inference_client", "transformers", "ollama", "litellm", "openai"]
```
Aligns this with the new name since #1198.
```diff
-elif chosen_inference == "hf_api_provider":
-    model = InferenceClientModel(provider="together")
+if chosen_inference == "inference_client":
+    model = InferenceClientModel(model_id="meta-llama/Llama-3.3-70B-Instruct", provider="nebius")
```
Specify provider "nebius", since it doesn't error out when using `tool_call="required"`.
src/smolagents/agents.py (outdated)
```python
"""Holds extended information about an agent run."""

result: Any
token_usage: TokenUsage | None
```
This can be None in case the token usage cannot be obtained.
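For callers, a minimal usage sketch of guarding against that `None`; the `return_full_result` flag name is an assumption here, not taken from the excerpts above:

```python
from smolagents import CodeAgent, InferenceClientModel

agent = CodeAgent(tools=[], model=InferenceClientModel())

# Assumed flag: ask run() for the full RunResult rather than only the final answer.
run_result = agent.run("What is 7 * 6?", return_full_result=True)

print(run_result.result)
# token_usage can be None when the backend does not report usage, so guard it.
if run_result.token_usage is not None:
    print(run_result.token_usage.input_tokens, run_result.token_usage.output_tokens)
else:
    print("No token usage reported for this run.")
```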
```python
)
except Exception as e:
    raise e
msg = msg.replace("<", r"\<").replace(">", r"\>")  # HTML tags seem to break Gradio Chatbot
```
@albertvillanova escaping HTML tags fixed a hairy bug where HTML-tagged messages wouldn't appear in the Gradio Chatbot. @yvrjsharma do you know why this is?
I agree this should be fixed in Gradio if possible.
Thanks for reporting this, I have tagged you both in a related issue on the Gradio repo.
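For context, a small self-contained sketch of the escaping applied above, assuming the intent is simply to keep literal angle brackets from being interpreted as HTML by the Gradio Chatbot:

```python
def escape_angle_brackets(msg: str) -> str:
    """Escape '<' and '>' so messages like '<tool_call>...' render literally
    in the Gradio Chatbot instead of being swallowed as HTML tags."""
    return msg.replace("<", r"\<").replace(">", r"\>")


print(escape_angle_brackets("<tool_call>final_answer</tool_call>"))
# \<tool_call\>final_answer\</tool_call\>
```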
src/smolagents/models.py (outdated)
```diff
@@ -1226,7 +1225,7 @@ class InferenceClientModel(ApiModel):

     def __init__(
         self,
-        model_id: str = "Qwen/Qwen2.5-Coder-32B-Instruct",
+        model_id: str = "Qwen/Qwen3-32B",
```
Update the default to the current best Qwen model!
This may raise an error for some providers?
Good point, reverting this and leaving it for later.
@albertvillanova could you take a look? Then, if the UI looks good to you, I'll add tests!
Thanks. Just some comments/questions below. And as you suggested, more tests to be added.
```python
self.last_input_token_count: int | None = None
self.last_output_token_count: int | None = None
```
I like the new approach, but this is a breaking change: should we deprecate these parameters?
src/smolagents/agents.py (outdated)
```python
token_usage: TokenUsage | None
messages: list[dict]
timing: Timing
state: str
```
Could you describe the meaning of all attributes in the docstring? For example, we should describe what values `state` can have.
...
Done!
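For reference, a hedged sketch of what the documented dataclass could look like; the attribute set mirrors the excerpts in this PR, while the example `state` values are assumptions:

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class RunResult:
    """Holds extended information about an agent run.

    Attributes:
        result: Final output of the run, if any.
        token_usage: Aggregated token counts, or None when the model backend
            does not report usage.
        messages: The agent's message history as a list of dictionaries.
        timing: Timing information (start, end, duration) for the run.
        state: Terminal state of the run, e.g. "success" or "max_steps_error"
            (example values, assumed for illustration).
    """

    result: Any
    token_usage: TokenUsage | None  # TokenUsage is the dataclass used elsewhere in this PR
    messages: list[dict]
    timing: Timing  # Timing is likewise defined elsewhere in the codebase
    state: str
```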
```python
self.model_id: str | None = model_id

@property
def last_input_token_count(self) -> int | None:
```
@albertvillanova WDYT of this implementation?
But you should emit a warning instead of logging one, no?
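A hedged sketch of the distinction being raised: emit a real warning through the `warnings` module (so callers see it and test runners can catch it) rather than writing to a logger. Class and attribute names are illustrative only:

```python
import warnings


class _ModelSketch:
    """Illustrative only; not the smolagents Model class."""

    def __init__(self):
        self._last_token_usage = None  # assumed internal storage

    @property
    def last_input_token_count(self) -> int | None:
        warnings.warn(
            "last_input_token_count is deprecated; read token usage from the new API instead.",
            FutureWarning,  # visible to end users by default, unlike DeprecationWarning
        )
        return None if self._last_token_usage is None else self._last_token_usage.input_tokens
```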
Thanks. Just some comments.
src/smolagents/models.py (outdated)
```diff
@@ -1309,17 +1339,24 @@ def generate_stream(
     for event in self.client.chat.completions.create(
         **completion_kwargs, stream=True, stream_options={"include_usage": True}
     ):
         if getattr(event, "usage", None):
             print("EV:", event)
```
I guess you forgot this print
src/smolagents/models.py (outdated)
```python
if getattr(event, "usage", None):
    self.last_input_token_count = event.usage.prompt_tokens
    self.last_output_token_count = event.usage.completion_tokens
```
You forgot these?
```python
if getattr(event, "usage", None):
    self.last_input_token_count = event.usage.prompt_tokens
    self.last_output_token_count = event.usage.completion_tokens
```
And these?
```diff
-def get_total_token_counts(self):
-    return {
-        "input": self.total_input_token_count,
-        "output": self.total_output_token_count,
-    }
+def get_total_token_counts(self) -> TokenUsage:
+    return TokenUsage(
+        input_tokens=self.total_input_token_count,
+        output_tokens=self.total_output_token_count,
+    )
```
For backward compatibility, maybe better:
- keep `get_total_token_counts` as it was implemented before, but raising a deprecation warning
- implement a new `get_token_usage` method?
I think it'll be more robust to not only keep the method `get_total_token_counts`, but also the underlying attributes `self.total_input_token_count` and `self.total_output_token_count`, because people might be accessing these directly!
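A hedged sketch of the backward-compatible shape discussed in this thread: keep the old method and counters, warn on the old entry point, and add the new one. The deprecation message wording and the minimal `TokenUsage` stand-in are assumptions:

```python
import warnings
from dataclasses import dataclass


@dataclass
class TokenUsage:
    """Minimal stand-in for the TokenUsage dataclass used in this PR."""

    input_tokens: int
    output_tokens: int


class MonitorSketch:
    def __init__(self):
        # Keep the old counters so code reading them directly keeps working.
        self.total_input_token_count = 0
        self.total_output_token_count = 0

    def get_total_token_counts(self) -> dict:
        # Old behavior preserved, with a nudge toward the new API.
        warnings.warn(
            "get_total_token_counts is deprecated; use get_token_usage instead.",
            FutureWarning,
        )
        return {
            "input": self.total_input_token_count,
            "output": self.total_output_token_count,
        }

    def get_token_usage(self) -> TokenUsage:
        return TokenUsage(
            input_tokens=self.total_input_token_count,
            output_tokens=self.total_output_token_count,
        )
```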
But you should emit a warning instead of logging one, no?
I just realized this.
Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
@albertvillanova about your suggestion with defining duration through
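The suggestion quoted above is cut off, so the following is only a hedged illustration of one way a `Timing` container could derive `duration` instead of storing it; all names are assumed:

```python
import time
from dataclasses import dataclass


@dataclass
class TimingSketch:
    """Illustrative only; not necessarily the PR's Timing implementation."""

    start_time: float
    end_time: float | None = None

    @property
    def duration(self) -> float | None:
        # Derived on access rather than stored as a separate field.
        return None if self.end_time is None else self.end_time - self.start_time


timing = TimingSketch(start_time=time.time())
# ... agent run happens here ...
timing.end_time = time.time()
print(timing.duration)
```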
#350: fix(smolagents): Instrument `Model.generate` instead of `Model.__call__`. After huggingface/smolagents#1337 we need to instrument all exported subclasses instead of the parent class. Bump "smolagents[mcp]>=1.17.0". Fix uninstrument.
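A rough, hedged sketch of what instrumenting `generate` on each exported subclass (rather than the base `Model.__call__`) might look like; this is only an illustration, not the actual instrumentation code referenced above:

```python
import functools

import smolagents


def _wrap_generate(model_cls):
    original = model_cls.generate

    @functools.wraps(original)
    def traced_generate(self, *args, **kwargs):
        # A real instrumentor would open a tracing span here; we just show the hook point.
        print(f"[trace] {model_cls.__name__}.generate called")
        return original(self, *args, **kwargs)

    model_cls.generate = traced_generate


# Per the commit message above, the subclasses are instrumented rather than the
# parent Model class, and `generate` rather than `__call__`.
for name in ("InferenceClientModel", "TransformersModel", "LiteLLMModel", "OpenAIServerModel"):
    model_cls = getattr(smolagents, name, None)  # guard in case a name is not exported
    if model_cls is not None:
        _wrap_generate(model_cls)
```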