aymeric-roucher
Collaborator

@aymeric-roucher aymeric-roucher commented Jul 2, 2025

gradio_client.Client.predict() returns the path to a generated image or audio file rather than the image or audio object itself. This PR fixes that. It also adds tests to make sure that agents return image objects when they call final_answer on an image.
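
The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `ImageStandIn`, `AudioStandIn`, and `wrap_space_output` are hypothetical names, the stand-in classes only hold a path instead of loading the file, and the suffix check is made case-insensitive with `pathlib`, which is an assumption that goes beyond the description.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any

IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp"}
AUDIO_EXTENSIONS = {".mp3", ".wav", ".ogg", ".m4a", ".flac"}


@dataclass
class ImageStandIn:
    """Hypothetical stand-in for an image wrapper type; it only stores the path."""
    path: str


@dataclass
class AudioStandIn:
    """Hypothetical stand-in for an audio wrapper type; it only stores the path."""
    path: str


def wrap_space_output(output: Any) -> Any:
    """Wrap a returned file path in a typed object; pass anything else through."""
    if isinstance(output, str):
        # Assumption: compare the lowercased suffix so "OUT.PNG" is also recognized.
        suffix = Path(output).suffix.lower()
        if suffix in IMAGE_EXTENSIONS:
            return ImageStandIn(output)
        if suffix in AUDIO_EXTENSIONS:
            return AudioStandIn(output)
    return output
```

With such a wrapper, a caller receives an object it can type-check instead of having to guess from the string whether it is a file path or plain text.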

@aymeric-roucher aymeric-roucher requested a review from A-Mahla July 2, 2025 14:07
@aymeric-roucher
Collaborator Author

aymeric-roucher commented Jul 2, 2025

Requesting review from @A-Mahla since we have been discussing how to integrate different tools!

(failing wikipedia tests are unrelated)

Comment on lines -645 to 657
-     return output[
+     if isinstance(output[1], str):
+         raise ValueError("The space returned this message: " + output[1])
+     output = output[
          0
      ]  # Sometime the space also returns the generation seed, in which case the result is at index 0
+ IMAGE_EXTENTIONS = [".png", ".jpg", ".jpeg", ".gif", ".webp"]
+ AUDIO_EXTENTIONS = [".mp3", ".wav", ".ogg", ".m4a", ".flac"]
+ if isinstance(output, str) and any([output.endswith(ext) for ext in IMAGE_EXTENTIONS]):
+     output = AgentImage(output)
+ elif isinstance(output, str) and any([output.endswith(ext) for ext in AUDIO_EXTENTIONS]):
+     output = AgentAudio(output)
  return output
Member

It’s not clear exactly what type output is: AgentImage | AgentAudio | str?
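
One way to make the type in question explicit is a named alias that callers can narrow with `isinstance`. The sketch below is only an illustration under assumptions: `AgentImageStub`, `AgentAudioStub`, `SpaceOutput`, and `describe` are hypothetical names that do not come from the PR, and the stub classes stand in for the real wrapper types.

```python
from typing import Union


class AgentImageStub:
    """Hypothetical stand-in for the image wrapper type."""

    def __init__(self, path: str) -> None:
        self.path = path


class AgentAudioStub:
    """Hypothetical stand-in for the audio wrapper type."""

    def __init__(self, path: str) -> None:
        self.path = path


# Assumed alias for the union suggested in the review comment.
SpaceOutput = Union[AgentImageStub, AgentAudioStub, str]


def describe(output: SpaceOutput) -> str:
    """Narrow the union with isinstance checks before using the value."""
    if isinstance(output, AgentImageStub):
        return f"image at {output.path}"
    if isinstance(output, AgentAudioStub):
        return f"audio at {output.path}"
    return f"text: {output}"
```

Note that in the diff under review, any value that is not a string with a known extension is returned unchanged, so an alias like this would only be accurate if the Space returns strings.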

@aymeric-roucher aymeric-roucher merged commit d832f6d into main Jul 8, 2025
4 checks passed
2 participants