Closed
Labels
UX (related to user experience), inference (inference related), training (training related)
Description
Currently we support neither `batch_size < dp_size` nor `batch_size % shards != 0`. We should relax this so that users do not have to recreate the vLLM engine or the policy when they have an odd `batch_size` (one that is smaller than `dp_size` or not evenly divisible by the number of shards).
This leads to a UX where we have to tear down the worker groups each time we want a different batch size, which shouldn't be necessary.
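One possible way to relax both constraints is to pad the batch up to the next multiple of the shard count, dispatch equal-sized shards, and drop the padded results afterwards. The sketch below only illustrates that idea on plain Python lists. The names `pad_to_multiple`, `shard`, and `unpad` are hypothetical and are not part of this project's API, and padding is an assumed technique, not necessarily the fix that was adopted.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def pad_to_multiple(
    batch: Sequence[T], num_shards: int, pad_item: T
) -> Tuple[List[T], int]:
    """Pad `batch` so its length is a positive multiple of `num_shards`.

    Returns the padded batch and the number of real (unpadded) items.
    Hypothetical helper for illustration only.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    num_real = len(batch)
    # Ceiling division, with at least one item per shard so that
    # batch_size < num_shards is handled as well.
    per_shard = max(1, -(-num_real // num_shards))
    target = per_shard * num_shards
    padded = list(batch) + [pad_item] * (target - num_real)
    return padded, num_real


def shard(batch: Sequence[T], num_shards: int) -> List[List[T]]:
    """Split an evenly divisible batch into contiguous, equal-sized shards."""
    if len(batch) % num_shards != 0:
        raise ValueError("batch must be padded to a multiple of num_shards")
    per_shard = len(batch) // num_shards
    return [
        list(batch[i * per_shard:(i + 1) * per_shard])
        for i in range(num_shards)
    ]


def unpad(results: Sequence[T], num_real: int) -> List[T]:
    """Drop the results that correspond to padding items."""
    return list(results[:num_real])


if __name__ == "__main__":
    # batch_size (3) < dp_size (4): every shard still receives one item.
    padded, num_real = pad_to_multiple(["a", "b", "c"], num_shards=4, pad_item="")
    shards = shard(padded, num_shards=4)
    # Stand-in for per-worker processing: flatten shard outputs in order.
    flat = [item for s in shards for item in s]
    print(unpad(flat, num_real))
```

Because the shards are contiguous slices, concatenating the per-shard outputs in shard order restores the original ordering, so trimming to `num_real` is enough to recover the caller's batch. In a real training path, the padded items would also need to be masked out of the loss, which this sketch does not cover.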
Examples: