Mention that InferenceClient also works with local models + other t… #1322
Conversation
Co-authored-by: célina <hanouticelina@gmail.com>
Will wait for a review from a maintainer (@albertvillanova @aymeric-roucher), given this PR bumps a dependency version.
Thank you! Nice improvement in the documentation about the model and base_url parameters.
However, if I understand correctly, the bump in the minimum version of huggingface-hub is intended to ensure that the default value of the provider parameter is set to "auto" (instead of the previous default, "hf-inference"). But this is a breaking change.
Would it make sense to initiate a deprecation cycle instead? That way, we could warn users about the upcoming change and give them time to adapt, minimizing disruption for those with currently working code in production.
AFAIK the breaking change has already been introduced (true that there was no prior notice of it). Bumping the minimum version ensures that all users get the same behavior, which is currently not the case depending on the huggingface_hub version they install. As for why there was no prior notice: we decided to default to "auto" instead of "hf-inference" because we thought it was a minimal change for most users. Nothing breaks in the wild; the data is just processed by a different provider. It was also quite needed after the recent rework of the HF Inference API, which reduced the number of models served by HF (no cold-start models anymore).
If anything, it's a less-breaking change, i.e. it makes things work out of the box in more cases.
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
Thanks for your feedback. I understand your point.
My concern wasn't so much about all users having identical behavior, but rather about avoiding a poor experience for even a single user.
The use case I had in mind (something similar happened to me) is: a user, already using smolagents in production, updates it to a new minor version to benefit from recent fixes or enhancements. After the update, their code suddenly breaks with an error due to a new, automatically assigned provider:
Error: Provider 'featherless-ai' not supported. Available values: 'auto' or any provider from ['black-forest-labs', 'cerebras', 'cohere', 'fal-ai', 'fireworks-ai', 'hf-inference', 'hyperbolic', 'nebius', 'novita', 'openai', 'replicate', 'sambanova', 'together']. Passing 'auto' (default value) will automatically select the first provider available for the model, sorted by the user's order in https://hf.co/settings/inference-providers.
I just wanted to ensure that users who trust the project enough to pin only minor versions don't suddenly hit unexpected errors that require them to dig into internal changes.
That said, if this potential disruption is acceptable from your point of view, I'm OK with merging as is.
Maybe it could also help to update the PR title to mention the change in default InferenceClientModel provider: that might help others quickly spot the root cause if they run into related issues.
I will also add a comment about this to the release notes to make it more visible to users.
Other tweaks: the default provider for InferenceClient is now "auto" (i.e. the user's favorite provider that's available for the model).