Fix mis-aligned prompts and completions in colocate mode #3491
Good catch, thanks for the fix!
To enhance consistency, maybe using all_prompt_text instead of prompt_text in line 1060 would also work?
Do you mean that we may not need to slice prompts if we use all_prompts_text in generation (if TP > 1, do the gathering; otherwise, fall back to prompts_text)?
+ all_prompts_text = prompts_text
  if self.vllm_tensor_parallel_size > 1:
      # Gather prompts from all ranks in the TP group and flatten.
      # Each rank starts with its own prompts; after gathering, all ranks see the full group set.
      orig_size = len(prompts_text)
      gathered_prompts = [None for _ in range(self.vllm_tensor_parallel_size)]
      torch.distributed.all_gather_object(gathered_prompts, prompts_text, group=self.tp_group)
-     prompts_text = [p for sublist in gathered_prompts for p in sublist]
+     all_prompts_text = [p for sublist in gathered_prompts for p in sublist]
  with profiling_context(self, "vLLM.generate"):
-     all_outputs = self.llm.generate(prompts_text, sampling_params=sampling_params, use_tqdm=False)
+     all_outputs = self.llm.generate(all_prompts_text, sampling_params=sampling_params, use_tqdm=False)
  completion_ids = [output.token_ids for outputs in all_outputs for output in outputs.outputs]
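For anyone following along, here is a minimal, torch-free sketch of why keeping prompts_text local matters. It is not the trainer code: gather_prompts and fake_generate are hypothetical stand-ins for all_gather_object and self.llm.generate, and the per-rank slice of completions is an assumption about how each rank keeps its share (it also assumes every rank holds the same number of prompts).

```python
def gather_prompts(per_rank_prompts):
    """Stand-in for all_gather_object: flatten the rank-ordered prompt lists."""
    return [p for sublist in per_rank_prompts for p in sublist]


def fake_generate(prompts):
    """Hypothetical generator: one completion per prompt, in input order."""
    return [f"completion for <{p}>" for p in prompts]


def local_pairs(per_rank_prompts, rank, overwrite_prompts=False):
    """Return the (prompt, completion) pairs that a given rank would log."""
    prompts_text = per_rank_prompts[rank]
    orig_size = len(prompts_text)

    all_prompts_text = gather_prompts(per_rank_prompts)
    if overwrite_prompts:
        # Old behaviour in the diff above: the local list is replaced by the gathered one.
        prompts_text = all_prompts_text

    all_completions = fake_generate(all_prompts_text)

    # Assumed step: each rank keeps only its own share of the completions.
    tp_slice = slice(rank * orig_size, (rank + 1) * orig_size)
    completions = all_completions[tp_slice]

    # zip stops at the shorter list, so an overwritten prompts_text pairs the
    # first gathered prompts with this rank's completions.
    return list(zip(prompts_text, completions))


per_rank_prompts = [["a0", "a1"], ["b0", "b1"]]
aligned = local_pairs(per_rank_prompts, rank=1)
misaligned = local_pairs(per_rank_prompts, rank=1, overwrite_prompts=True)
```

With a separate all_prompts_text, rank 1 pairs b0 and b1 with their own completions. With the overwrite, it pairs a0 and a1 with the completions for b0 and b1, which is the kind of mismatch this PR is about.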
Nit
Can you confirm it also works like this?
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Thanks!
What does this PR do?
Fixes #3492
See the fixed log completions result:

Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
CC @qgallouedec
Thanks to @fabianlim for catching this.