Support serve + chat + generate with Mixtral teacher model #1002

Conversation
"--model-name", | ||
default="instructlab/merlinite-7b-lab", | ||
show_default=True, | ||
help="model name to use in training", |
chore: Might want to change this to something like "huggingface path to use in training", just so users don't put only `merlinite-7b-lab` instead of `instructlab/merlinite-7b-lab`.
Can we extrapolate any of this data from other args? If not, let's update the help text so it's less confusing for users.
@jaideepr97 @cdoern could you help clarify the intention of the PR with the changes to the title/description you folks mentioned in the call, please? It'll help with the review process, especially grokking the failing test cases.
```python
@@ -302,6 +337,7 @@ def generate_data(
    logger,
    api_base,
    tls_insecure,
    model_family: str,
```
Do we need to introduce a new `model_family`? Can we not interpret the model family from `model_name`?

- Name: `merlinite-4b-asdf.gguf`. Family: merlinite
- Name: `granite-4b-asdf.gguf`. Family: merlinite
- Name: `mixtral-4b-asdf.gguf`. Family: mixtral

This solves for what @mairin commented too, without any additional code changes.
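The inference suggested above could be sketched as follows. This is a hypothetical illustration, not the actual instructlab implementation: the function name, the prefix map, and the default are assumptions, with granite intentionally mapped to the merlinite template per the examples in this thread.

```python
import os

# Assumed prefix-to-family map; granite deliberately maps to merlinite,
# matching the examples given in the review comment above.
FAMILY_PREFIXES = {
    "merlinite": "merlinite",
    "granite": "merlinite",
    "mixtral": "mixtral",
}
DEFAULT_FAMILY = "merlinite"


def infer_model_family(model_name: str) -> str:
    """Guess the template family from a model file name or path."""
    base = os.path.basename(model_name).lower()
    for prefix, family in FAMILY_PREFIXES.items():
        if base.startswith(prefix):
            return family
    # Unknown names fall back to the current hardcoded default
    return DEFAULT_FAMILY
```

With this, `infer_model_family("granite-4b-asdf.gguf")` yields `"merlinite"`, so no extra flag would be needed for the common cases.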
Two workflows to test, need to run an end-to-end test:
- run this with a granite model, ensure that it correctly loads the merlinite template and works correctly
- run this with a mixtral model, ensure that it correctly loads the mixtral template and works correctly
Was there a resolution to this? I'm curious about this point as well
Nah, I was told to ignore this for this PR; it'll be a follow-up 🤷🏽
Looks like @russel has more context and can help move this forward
@anik120 updated the PR title to make `ilab` model agnostic. The changes included in this PR do the following:
```python
@click.option(
    "--model-family",
    default="merlinite",
    help="model family to use when picking a generation template",
)
```
help="model family to use when picking a generation template", | |
help="Model family generation template to serve with, e.g. 'merlinite', 'granite', etc.", |
"--model-name", | ||
default="instructlab/merlinite-7b-lab", | ||
show_default=True, | ||
help="model name to use in training", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we extrapolate any of this data from other args? If not let's update the help text so it's less confusing for users
Both generate and serve already had a flag for the context size, meaning the user can override the default of 4096 in those two scenarios.

Added a new flag called `model-family` to serve, which currently supports `merlinite` or `mixtral`. This is then used to load the proper chat template from a hardcoded map of supported templates. If nothing is specified, we default to the current one hardcoded in instructlab. Any function that can start a server needs this flag as well, which means chat and generate need it too. Added the same flag to generate, since the generate prompt template needs extra padding for mixtral.

Also, added `model-name` to `ilab train`, as linux train had a hardcoded merlinite reference for the model name. Most other "hardcoded" references are really just defaults which can be overridden by the user.

Signed-off-by: Charlie Doern <cdoern@redhat.com>
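The "hardcoded map of supported templates" described in the commit message could look something like this. The template strings and helper name here are illustrative assumptions, not the real instructlab code.

```python
from typing import Optional

# Assumed template strings for illustration only; the real project's
# templates live in its own source, not here.
CHAT_TEMPLATES = {
    "merlinite": "<|system|>\n{system}\n<|user|>\n{user}\n<|assistant|>\n",
    "mixtral": "<s>[INST] {system} {user} [/INST]",
}


def select_chat_template(model_family: Optional[str]) -> str:
    """Pick a chat template by family, defaulting to merlinite."""
    if not model_family:
        # No --model-family given: keep the current default behavior
        return CHAT_TEMPLATES["merlinite"]
    try:
        return CHAT_TEMPLATES[model_family]
    except KeyError:
        raise ValueError(f"unsupported model family: {model_family}") from None
```

Raising on an unknown family (rather than silently defaulting) surfaces typos in the `--model-family` flag early, which matches the "currently supports merlinite or mixtral" constraint.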
I think this is good, and we can continue with follow-up PRs to improve UX, help text, and so forth. I'd like to go ahead and merge, since some others are waiting for the functional part to be merged.
Comments will be addressed in a follow-up.
Should we be using the chat templates as specified in the models' own metadata? This would get us out of having to hardcode the templates at all, and instead we'll just use what the model's metadata says it requires. This would mean expecting models to maintain appropriate chat templates, but I believe that's a pretty common practice.

Definitely sounds like a good follow-up issue to look at! Thanks, @bbrowning!
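The metadata-driven alternative discussed above could be sketched as a simple fallback lookup. GGUF files can carry an embedded template under the `tokenizer.chat_template` metadata key; the metadata dict here is a stand-in for illustration, as real code would read it from the model file with a GGUF parser.

```python
def template_from_metadata(metadata: dict, fallback: str) -> str:
    """Prefer the chat template the model ships in its own metadata.

    Falls back to a hardcoded default only when the model carries none,
    which removes the need to maintain a per-family template map.
    """
    return metadata.get("tokenizer.chat_template", fallback)
```

Usage: `template_from_metadata(loaded_gguf_metadata, CHAT_TEMPLATES_DEFAULT)`, where both arguments are supplied by the caller.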
#973 covers semantic fixes for serve and chat using the wrong model paths/names, and also speaks to hardcoded references to merlinite. #29 is the tracker for actually making sure serve uses the proper template.
resolves #973
resolves #29