[BUG] add_generation_prompt=False when it should be True? #993

@ofirzaf

Describe the bug
CodeAgent, and the planning step of ToolCallingAgent, do not pass add_generation_prompt=True when applying the chat template to the model's input messages.

Note

Not sure if it is a bug or this is how you intended for it to work, but from what I saw it confuses the model and just seems wrong. If this is intended, can you please explain the logic?

Note the <|endoftext|> token at the end of the prompt.

# End of CodeAgent prompt
<|system|> ...
Now Begin! If you solve the task correctly, you will receive a reward of $1,000,000.<|end|><|user|>New task:
Could you get me the title of the page at url "https://huggingface.co/blog"?<|end|><|endoftext|>
# End of planning step initial_facts prompt
<|user|> ...
Now Begin!<|end|><|endoftext|>

The problem is that when calling the model, the condition to add the generation prompt is:

prompt_tensor = self.tokenizer.apply_chat_template(
    ...
    add_generation_prompt=True if tools_to_call_from else False,
)

However, tools_to_call_from is not passed to the call when using CodeAgent or when planning, so add_generation_prompt evaluates to False in both cases.
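
To make the effect concrete, here is a minimal sketch that does not load any model. The render_chat function is a hypothetical, simplified stand-in for a Phi-style chat template, inferred only from the prompt output shown above; it is not the tokenizer's real template, and the message contents are placeholders. It applies the same condition with tools_to_call_from left unset and compares the result to what add_generation_prompt=True would give.

```python
# Hypothetical stand-in for a Phi-style chat template; NOT the real tokenizer code.
EOS_TOKEN = "<|endoftext|>"


def render_chat(messages, add_generation_prompt):
    """Render each message as <|role|>content<|end|>, then close the prompt."""
    prompt = "".join(f"<|{m['role']}|>{m['content']}<|end|>" for m in messages)
    # With a generation prompt, the text ends by opening the assistant turn.
    # Without it, this simplified template appends the end-of-sequence token.
    return prompt + ("<|assistant|>" if add_generation_prompt else EOS_TOKEN)


# Placeholder messages, not the actual smolagents prompts.
messages = [
    {"role": "system", "content": "Now Begin!"},
    {"role": "user", "content": "New task: ..."},
]

# Assumption from the report: CodeAgent and planning calls leave this unset.
tools_to_call_from = None
flag = True if tools_to_call_from else False  # same condition as quoted above

current = render_chat(messages, add_generation_prompt=flag)
expected = render_chat(messages, add_generation_prompt=True)

print(current)   # ends with <|end|><|endoftext|>
print(expected)  # ends with <|end|><|assistant|>
```

Under these assumptions, the first prompt tells the model the sequence is already over, while the second leaves the assistant turn open for generation.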

Code to reproduce the error

from smolagents import ToolCallingAgent, TransformersModel, CodeAgent

import torch


if __name__ == '__main__':
    model_id = 'microsoft/Phi-4-mini-instruct'
    model = TransformersModel(model_id=model_id, torch_dtype=torch.float16, device_map='cuda')
    agent = CodeAgent(tools=[], model=model, add_base_tools=True)
    agent.run('Could you get me the title of the page at url "https://huggingface.co/blog"?', reset=True)

Expected behavior
Should add_generation_prompt=True be the default for all model calls?

Packages version:
smolagents==1.11.0

@albertvillanova saw you were involved in PRs around this issue, maybe you can shed some light on this matter. Thanks!
