-
Notifications
You must be signed in to change notification settings - Fork 2k
Match multiline final answers in remote executors #1444
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @albertvillanova !
Thank you @albertvillanova for checking on this. With your change, the regex-based method for final answer detection is extended to support multi-line arguments, solving #1428 for some cases. May I suggest however to also consider alternatives that are not based on regex/pattern matching. In my experience, using regex makes the detection mechanism less robust. Consider the following tasks and with the corresponding code snippet produced by the LLM: Task:
magic_num = magic_number_tool()
if magic_num == 5:
final_answer(3)
else:
final_answer(2) Problem: Task:
english_text = """The code agents in smolagents report the final answer to a task using the final_answer tool. For example, to
report that the final answer is 5, the agent may use the following code:
final_answer(5)"""
klingon_translation = to_klingon(english=english_text)
final_answer(klingon_translation) Problem:
In #1429 I suggested an alternative approach which is robust to the above examples. The local Python executor, which uses another detection mechanism, is also already robust to the above examples. Please consider. However you decide, thanks for the great work maintaining this project! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tobiasofsn good point, I indeed think that exception based mechanism is more robust! Only issue is that your current implementation would overwrite any custom FinalAnswerTool set by the user : indeed, better would be to wrap the forard method with a function that raises an exception at the end instead of replacing it.
Thanks for your review @aymeric-roucher: that is precisely why I had not merged this PR yet: I wanted to ensure there was not a strong reason why remote executors were using the regex approach, differently from the local executor. I think I could merge this PR as a hotfix (especially with the addition of the test for multi-line final answer), but without closing the alternative more robust PR: Then, I will review in detail the alternative PR so it is fully aligned and supports custom FinalAnswerTool implementations. |
Thank you @albertvillanova and @aymeric-roucher. Sounds like a good plan. Let me know if you want my help to make any changes. |
Removing my requested changes to let us merge this PR
Match multiline final answers in remote executors by using the regex DOTALL flag.
Fix #1428.
Related to: