
Conversation

@Receiling (Contributor) commented Jul 16, 2025:

Currently, the embedding API supports prompt truncation, but the encode function of the async engine (both the V1 and V0 engines) does not support truncating long prompts. This PR adds tokenization_kwargs to the encode function of both the V1 and V0 engines, allowing direct use of engine.encode to truncate inputs for embedding models.
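As a usage illustration, here is a minimal sketch of calling the async engine's encode with the new argument. The model name, engine setup, and truncation values below are assumptions for the example, not taken from the PR; the truncation kwargs are standard Hugging Face tokenizer arguments.

```python
import asyncio

from vllm import PoolingParams
from vllm.engine.arg_utils import AsyncEngineArgs
from vllm.engine.async_llm_engine import AsyncLLMEngine


async def main() -> None:
    # Hypothetical setup: an embedding model served by the V0 async engine.
    engine = AsyncLLMEngine.from_engine_args(
        AsyncEngineArgs(model="BAAI/bge-m3", task="embed"))

    # tokenization_kwargs (added by this PR) is forwarded to the tokenizer,
    # so standard truncation arguments can cap long prompts before encoding.
    async for output in engine.encode(
            "some very long document ... " * 1000,
            pooling_params=PoolingParams(),
            request_id="embed-0",
            tokenization_kwargs={"truncation": True, "max_length": 512},
    ):
        final_output = output  # the last yielded item is the finished result

    print(final_output.outputs.data.shape)  # embedding vector


if __name__ == "__main__":
    asyncio.run(main())
```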

@gemini-code-assist bot left a comment:


Code Review

This pull request introduces the tokenization_kwargs parameter to the encode methods in both the V0 and V1 async LLM engines. This is a valuable addition that enables passing tokenizer-specific arguments, like truncation settings, directly to the encode function, which is particularly useful for embedding models.

I've reviewed the changes in vllm/engine/async_llm_engine.py and vllm/v1/engine/async_llm.py. The implementation correctly adds the new parameter and passes it through the respective call stacks to the input preprocessor. The changes are consistent, well-scoped to the stated objective, and I did not find any issues of high or critical severity.
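For context, the shape of the change is roughly the following. This is a paraphrased sketch based on the descriptions above, not the actual diff; the signature is abbreviated and existing parameters are omitted.

```python
from typing import Any, Optional


class AsyncLLMEngine:  # stand-in for the real V0 engine class
    async def encode(
        self,
        prompt,
        pooling_params,
        request_id: str,
        tokenization_kwargs: Optional[dict[str, Any]] = None,  # new parameter
        # ... other existing parameters elided ...
    ):
        # The kwarg is threaded through the request-adding path down to the
        # input preprocessor, which hands it to the tokenizer when turning a
        # text prompt into token IDs (e.g. truncation=True, max_length=N).
        ...
```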


👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, they only run the fastcheck CI, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@Receiling Receiling requested a review from aarnphm as a code owner July 22, 2025 06:49
@mergify mergify bot added the frontend label Jul 22, 2025

mergify bot commented Jul 22, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @Receiling.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Jul 22, 2025
@Receiling Receiling force-pushed the add_tokenization_kwargs_for_v0_engine branch from 5c6f52d to 12a3fc7 on July 22, 2025 07:33
@mergify mergify bot removed the needs-rebase label Jul 22, 2025
@DarkLight1337 (Member) left a comment:


LGTM after you address last comment, thanks for bearing with me!

@DarkLight1337 (Member):

Can you fix pre-commit?

@Receiling (Contributor, Author):

Hmm, I'm a bit confused; which one are you referring to?

@DarkLight1337 (Member):

@Receiling (Contributor, Author):

Oh, got it, I'll try to fix it.

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) July 22, 2025 12:00
@github-actions github-actions bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) Jul 22, 2025
auto-merge was automatically disabled July 22, 2025 12:33 (Pull Request is not mergeable)

@vllm-bot vllm-bot merged commit 44554a0 into vllm-project:main Jul 22, 2025
74 of 80 checks passed
zixi-qi pushed a commit to zixi-qi/vllm that referenced this pull request Jul 23, 2025
avigny pushed a commit to avigny/vllm that referenced this pull request Jul 31, 2025
wenscarl pushed a commit to wenscarl/vllm that referenced this pull request Aug 4, 2025
x22x22 pushed a commit to x22x22/vllm that referenced this pull request Aug 5, 2025
npanpaliya pushed a commit to odh-on-pz/vllm-upstream that referenced this pull request Aug 6, 2025
jinzhen-lin pushed a commit to jinzhen-lin/vllm that referenced this pull request Aug 9, 2025
paulpak58 pushed a commit to paulpak58/vllm that referenced this pull request Aug 13, 2025
diegocastanibm pushed a commit to diegocastanibm/vllm that referenced this pull request Aug 15, 2025
Labels: frontend, ready (ONLY add when PR is ready to merge/full CI is needed), v1
Projects: None yet
3 participants