
Mention that InferenceClient also works with local models + other tweaks #1322


Merged: 3 commits merged into main on May 15, 2025

Conversation

julien-c (Member)


Other tweaks:

  • Mention that the default provider for InferenceClient is now "auto" (i.e. the user's favorite provider that's available for the model)
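The "auto" behavior described above can be sketched in a few lines. This is a hypothetical illustration of the selection rule (pick the first provider in the user's preference order that actually serves the requested model), not the huggingface_hub implementation; all function and variable names here are made up for the example.

```python
def resolve_provider(provider, model, user_order, availability):
    """Return a concrete provider name for a request.

    provider     -- "auto" or an explicit provider name
    model        -- a model id, e.g. "meta-llama/Llama-3.1-8B-Instruct"
    user_order   -- provider names sorted by the user's preference
                    (as configured on https://hf.co/settings/inference-providers)
    availability -- mapping: provider name -> set of model ids it serves
    """
    if provider != "auto":
        # An explicit provider always wins over automatic selection.
        return provider
    for candidate in user_order:
        if model in availability.get(candidate, set()):
            return candidate
    raise ValueError(f"No provider serves {model!r}")
```

For example, with `user_order = ["together", "hf-inference", "novita"]`, a model served only by "hf-inference" and "novita" resolves to "hf-inference", the first preferred provider that has it.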

Co-authored-by: célina <hanouticelina@gmail.com>
@julien-c (Member Author)

will wait for a review from a maintainer @albertvillanova @aymeric-roucher given this PR bumps a dependency version

@albertvillanova (Member) left a comment

Thank you!

Nice improvement in the documentation about model and base_url parameters.

However, if I understand correctly, the bump in the minimum version of huggingface-hub is intended to ensure that the default value of the provider parameter is "auto" (instead of the previous default, "hf-inference"). But this is a breaking change.

Would it make sense to initiate a deprecation cycle instead? That way, we could warn users about the upcoming change and give them time to adapt, minimizing disruption for those with currently working code in production.

@Wauplin (Contributor) commented May 14, 2025

AFAIK the breaking change has already been introduced (it's true that there was no prior notice). Bumping the minimum version ensures that all users get the same behavior, which is currently not the case: it depends on which huggingface_hub version they install.

Regarding why there was no prior notice: we decided to default to "auto" instead of "hf-inference" because we thought it was a minimal change for most users. Nothing breaks in the wild; the data is just processed by a different provider. It was also quite needed after the recent rework of the HF Inference API, which reduced the number of models served by HF (no cold-start models anymore).

@julien-c (Member Author)

if anything it's a less-breaking change, i.e. it makes things work out of the box in more cases

Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
@julien-c julien-c requested a review from albertvillanova May 14, 2025 17:04
@albertvillanova (Member) left a comment

Thanks for your feedback. I understand your point.

My concern wasn't so much about all users having identical behavior, but rather about avoiding a poor experience for even a single user.

The use case I had in mind (something similar happened to me) is: a user, already using smolagents in production, updates it to a new minor version to benefit from recent fixes or enhancements. After the update, their code suddenly breaks with an error due to a new, automatically assigned provider:

Error: Provider 'featherless-ai' not supported. Available values: 'auto' or any provider from ['black-forest-labs', 'cerebras', 'cohere', 'fal-ai', 'fireworks-ai', 'hf-inference', 'hyperbolic', 'nebius', 'novita', 'openai', 'replicate', 'sambanova', 'together']. Passing 'auto' (default value) will automatically select the first provider available for the model, sorted by the user's order in https://hf.co/settings/inference-providers.

I just wanted to ensure that users who trust the project enough to pin only minor versions don't suddenly hit unexpected errors that require them to dig into internal changes.

That said, if this potential disruption is acceptable from your point of view, I'm OK with merging as is.

Maybe it could also help to update the PR title to mention the change in default InferenceClientModel provider: that might help others quickly spot the root cause if they run into related issues.

  • I will also add a comment about this to the Release notes to make it more visible to users.
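As an aside, a downstream project worried about the pinned-minor-version scenario described above could gate its behavior on the installed huggingface_hub version. The sketch below is illustrative only: the changeover version is a hypothetical parameter of the example, not the exact release where the default actually changed, and the helper names are invented.

```python
def parse_version(v):
    """Parse 'X.Y.Z' into a comparable tuple of ints.

    Non-digit characters within each segment are stripped, which is a rough
    way of tolerating pre-release suffixes; a real project would use
    packaging.version instead.
    """
    parts = []
    for piece in v.split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts)

def default_provider_for(hub_version, changeover="0.31.0"):
    """Return the default provider a given huggingface_hub version would use.

    `changeover` is a hypothetical boundary for this sketch: versions at or
    above it default to "auto", older ones to "hf-inference".
    """
    if parse_version(hub_version) >= parse_version(changeover):
        return "auto"
    return "hf-inference"
```

A project could use such a check at import time to emit a warning (or pass an explicit provider) when the default is about to differ from what its users previously relied on.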

@julien-c merged commit f11a04e into main on May 15, 2025. 2 of 4 checks passed.

5 participants