
Conversation

cdoern
Contributor

@cdoern cdoern commented Apr 25, 2024

Both generate and serve already had a flag for the context size, meaning the user can override the default of 4096 in those two scenarios.

Added a new flag called model-family to serve, which currently supports merlinite or mixtral. This is then used to load the proper chat template from a hardcoded map of supported templates. If nothing is specified, we default to the one currently hardcoded in instructlab.
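
For illustration, that hardcoded map could look roughly like the following sketch; the identifiers and template strings here are placeholders, not the actual instructlab source:

```python
# Minimal sketch of a family -> chat template map; names and template
# strings are placeholders, not the actual instructlab code.
CHAT_TEMPLATES = {
    # default template currently hardcoded in instructlab
    "merlinite": "<|system|>\n{system}\n<|user|>\n{prompt}\n<|assistant|>\n",
    # Mixtral uses the [INST] instruction format and needs extra padding
    "mixtral": "<s>[INST] {prompt} [/INST]",
}

DEFAULT_FAMILY = "merlinite"

def resolve_chat_template(model_family: str | None) -> str:
    """Pick the chat template for --model-family, falling back to the default."""
    family = (model_family or DEFAULT_FAMILY).lower()
    if family not in CHAT_TEMPLATES:
        raise ValueError(f"unsupported model family: {family}")
    return CHAT_TEMPLATES[family]
```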

Added the same flag to generate, since the generate prompt template needs extra padding for mixtral.

Also, add model-name to ilab train as linux train had a hardcoded merlinite reference for the model name.

Most other "hardcoded" references are really just defaults which can be overridden by the user.
#973 speaks more about semantic fixes between serve and chat using the wrong model paths/names, but it is also meant to cover hardcoded references to merlinite. #29 is the tracker for actually making sure serve uses the proper template.

resolves #973
resolves #29

@cdoern cdoern force-pushed the model-agnostic branch 2 times, most recently from 341e50d to a32921a on April 25, 2024 18:50
@nathan-weinberg nathan-weinberg requested a review from a team April 25, 2024 18:51
@cdoern cdoern force-pushed the model-agnostic branch 5 times, most recently from 55d0c0b to 5072867 on April 26, 2024 13:17
@oindrillac oindrillac marked this pull request as draft April 26, 2024 14:24
@cdoern cdoern force-pushed the model-agnostic branch 2 times, most recently from 215a6bd to 07c6b87 on April 27, 2024 15:44
@cdoern cdoern marked this pull request as ready for review April 27, 2024 21:20
@github-actions github-actions bot added the testing Relates to testing label Apr 27, 2024
@cdoern cdoern removed the testing Relates to testing label Apr 29, 2024
@github-actions github-actions bot added the testing Relates to testing label Apr 29, 2024
@cdoern cdoern force-pushed the model-agnostic branch 2 times, most recently from a68c55f to eac2023 on April 29, 2024 14:16
@nathan-weinberg nathan-weinberg self-requested a review April 29, 2024 14:37
@mairin
Member

mairin commented Apr 29, 2024

Two workflows to test; we need to run an end-to-end test:

  1. run this with a granite model, ensure that it correctly loads the merlinite template and works correctly
  2. run this with a mixtral model, ensure that it correctly loads the mixtral template and works correctly

"--model-name",
default="instructlab/merlinite-7b-lab",
show_default=True,
help="model name to use in training",
Contributor

chore: Might want to change this to something like "huggingface path to use in training" just so users don't put only merlinite-7b-lab instead of instructlab/merlinite-7b-lab.

Member

Can we extrapolate any of this data from other args? If not, let's update the help text so it's less confusing for users.

Contributor

@anik120 anik120 left a comment

@jaideepr97 @cdoern could you help clarify the intention of the PR with the changes to the title/description you folks mentioned in the call, please? It'll help with the review process, especially with grokking the failing test cases.

@@ -302,6 +337,7 @@ def generate_data(
logger,
api_base,
tls_insecure,
model_family: str,
Contributor

@anik120 anik120 Apr 29, 2024

Do we need to introduce a new model_family? Can we not interpret the model family from model_name?
Name: merlinite-4b-asdf.gguf. Family: merlinite
Name: granite-4b-asdf.gguf. Family: merlinite
Name: mixtral-4b-asdf.gguf. Family: mixtral

This solves for what @mairin commented too, without any additional code changes.

Two workflows to test; we need to run an end-to-end test:

  • run this with a granite model, ensure that it correctly loads the merlinite template and works correctly
  • run this with a mixtral model, ensure that it correctly loads the mixtral template and works correctly
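
A rough sketch of the inference being proposed here; the helper and table names are hypothetical, not code from this PR:

```python
# Hypothetical sketch of inferring the family from the model name;
# KNOWN_FAMILIES and infer_model_family are illustrative names only.
import os

KNOWN_FAMILIES = {
    "merlinite": "merlinite",
    "granite": "merlinite",  # granite models use the merlinite template
    "mixtral": "mixtral",
}

def infer_model_family(model_name: str) -> str:
    """Guess the family from a name like 'mixtral-4b-asdf.gguf'."""
    base = os.path.basename(model_name).lower()
    for prefix, family in KNOWN_FAMILIES.items():
        if base.startswith(prefix):
            return family
    return "merlinite"  # current default when nothing matches
```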

Member

Was there a resolution to this? I'm curious about this point as well.

Contributor

Nah, I was told to ignore it for this PR; it'll be a follow-up 🤷🏽

@anik120 anik120 dismissed their stale review April 29, 2024 16:29

Looks like @russel has more context and can help move this forward

@jaideepr97 jaideepr97 changed the title Make ilab model agnostic Support serve + chat + generate with Mixtral teacher model Apr 29, 2024
@jaideepr97
Member

jaideepr97 commented Apr 29, 2024

@anik120 updated the PR title

The changes included in this PR do the following:

  • changes to serve so that mixtral chat template is passed to the model server
  • changes to generate so that mixtral generation template is passed to the model server
  • a 1-line change to train so that any model other than merlinite can be trained on Linux

@cdoern cdoern force-pushed the model-agnostic branch 3 times, most recently from 2cf4be6 to de6fec2 on April 29, 2024 17:53

@click.option(
"--model-family",
default="merlinite",
help="model family to use when picking a generation template",
Member

Suggested change:
-    help="model family to use when picking a generation template",
+    help="Model family generation template to serve with, e.g. 'merlinite', 'granite', etc.",

"--model-name",
default="instructlab/merlinite-7b-lab",
show_default=True,
help="model name to use in training",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we extrapolate any of this data from other args? If not let's update the help text so it's less confusing for users

Both generate and serve already had a flag for the context size, meaning the user can override the default of 4096 in those two scenarios.

Added a new flag called `model-family` to serve which currently supports `merlinite` or `mixtral`. This is then used to load the proper chat template from a hardcoded map of supported templates. If nothing is specified, we default to the one currently hardcoded in instructlab. Any function that can start a server needs this flag as well, which means chat and generate need it too.

Added the same flag to generate, since the generate prompt template needs extra padding for mixtral.

Also, add `model-name` to `ilab train` as linux train had a hardcoded merlinite reference for the model name.

Most other "hardcoded" references are really just defaults which can be overridden by the user.

Signed-off-by: Charlie Doern <cdoern@redhat.com>
Member

@russellb russellb left a comment

I think this is good, and we can continue with follow-up PRs to improve UX, help text, and so forth - I'd like to go ahead and merge since some others are waiting for the functional part to land.

Member

@nathan-weinberg nathan-weinberg left a comment

Comments will be addressed in a followup.

@mergify mergify bot merged commit 03dc971 into instructlab:main Apr 29, 2024
@bbrowning
Contributor

Should we be using the chat templates as specified in the tokenizer_config.json of each model from HuggingFace instead of hardcoding them? Essentially https://huggingface.co/docs/transformers/main/en/chat_templating#how-do-i-use-chat-templates

This would get us out of having to hardcode the templates at all; instead we'd just use what the model's metadata says it requires. This would mean expecting models to maintain appropriate chat templates, but I believe that's a pretty common practice.
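
For reference, a minimal sketch of that approach using the transformers API; it assumes the model repo ships a chat_template in its tokenizer_config.json:

```python
# Sketch of using the model's own chat template via transformers,
# instead of a hardcoded per-family map.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("instructlab/merlinite-7b-lab")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

# apply_chat_template renders whatever template the model's metadata
# declares, so no per-family hardcoding is needed on our side.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```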

@russellb
Member

Definitely sounds like a good follow-up issue to look at! Thanks, @bbrowning!

Successfully merging this pull request may close these issues:

  • ilab serve and config (init) code should be model-dependent
  • chat template is hard-coded