-
Notifications
You must be signed in to change notification settings - Fork 117
feat: add OpenAI format dataset for SFT #485
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add OpenAI format dataset for SFT #485
Conversation
Signed-off-by: Atsunori Fujita <afujita@nvidia.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! Could we add some documentation on this class and how it differs from prompt_response_dataset.py
? Do you think it would make sense to consider merging this class with PromptResponseDataset
?
Signed-off-by: Atsunori Fujita <afujita@nvidia.com>
Hi @ashors1, added docstrings and unit tests.
The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Thank you for the contribution!
@AtsunoriFujita could you run pre-commit on your change? It fails our linter job |
Signed-off-by: Atsunori Fujita <afujita@nvidia.com>
Hi @terrykong, applied pre-commit. |
@AtsunoriFujita do you mind putting an example run command in the description so that users finding this PR can learn how to use this? |
@terrykong, thank you. I added it. |
Signed-off-by: Atsunori Fujita <afujita@nvidia.com>
Signed-off-by: Atsunori Fujita <afujita@nvidia.com>
Signed-off-by: Atsunori Fujita <afujita@nvidia.com>
Signed-off-by: Atsunori Fujita <afujita@nvidia.com>
What does this PR do ?
This PR enables using the OpenAI format dataset from a
json/jsonl
when running SFT.Issues
List issues that this PR closes (syntax):
Usage
Modify
examples/configs/sft.yaml
Run SFT job
Before your PR is "Ready for review"
Pre checks:
Additional Information