[Model][VLM] Support JinaVL Reranker #20260
Conversation
Summary of Changes
Hello @shineran96, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces comprehensive support for the JinaVL Reranker model, significantly expanding the system's capabilities to handle multimodal reranking tasks. It modifies the `LLM.score` API to seamlessly integrate image inputs alongside text, enabling users to perform cross-encoding with both modalities. The changes include the core model implementation, specialized prompt formatting for multimodal inputs, and extensive testing to ensure accuracy and reliability.
Highlights
- **New Model Support**: Introduced support for the `JinaVLForRanking` model, enabling multimodal (text and image) reranking capabilities within the system.
- **Multimodal `score` API Extension**: The `LLM.score` API has been extended to accept image inputs alongside text for cross-encoding tasks, specifically tailored for the JinaVL model. This allows for flexible input combinations like text-image, image-text, and image-image pairs.
- **Core Model Implementation**: Added the `JinaVLForSequenceClassification` model definition under `vllm/model_executor/models/jina_vl.py`. This implementation integrates with the existing `Qwen2VLForConditionalGeneration` and includes a custom `JinaVLScorer` for the ranking head.
- **Multimodal Prompt Formatting**: A new `formatting_prompts` utility was added to `vllm/entrypoints/score_utils.py`. This function is crucial for correctly constructing prompts and handling multimodal data for JinaVL, including inserting image placeholders and appending the specific score token ID (100).
- **Comprehensive Testing**: New tests were added in `tests/models/language/pooling/test_jinavl_reranker.py` to validate the JinaVL Reranker across various multimodal scenarios (text-image, text-text, image-text, image-image), ensuring functional parity with the original Hugging Face implementation.
- **Documentation and Examples**: The `docs/models/supported_models.md` file was updated to list the newly supported JinaVL model, and a new offline inference example script (`examples/offline_inference/jinavl_reranker.py`) was provided to demonstrate its usage.
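The prompt-formatting behavior described above can be sketched in plain Python. This is a hedged illustration only: the helper name, prompt template, and placeholder string are assumptions, not the actual `formatting_prompts` implementation; only the score token ID (100) and the general behavior (insert image placeholders, append a score token) come from the summary above.

```python
# Hedged sketch of the multimodal prompt assembly described above.
# The template and placeholder string are illustrative assumptions;
# the score token ID (100) comes from the PR summary.

SCORE_TOKEN_ID = 100  # token appended after each prompt so the ranking head scores it
IMAGE_PLACEHOLDER = "<|vision_start|><|image_pad|><|vision_end|>"  # assumed Qwen2-VL-style placeholder


def format_rerank_prompt(query: str, doc: str, doc_is_image: bool) -> tuple[str, int]:
    """Build a cross-encoder prompt; image documents get a placeholder
    that the multimodal processor later expands into image embeddings."""
    doc_part = IMAGE_PLACEHOLDER if doc_is_image else doc
    prompt = f"Query: {query} Document: {doc_part}"
    # In the real pipeline the score token is appended at the token-ID level
    # after tokenization; it is returned separately here for illustration.
    return prompt, SCORE_TOKEN_ID
```

This mirrors the key design point: whichever side of the pair is an image is represented by a placeholder in the text prompt, and every prompt ends with the same score token whose hidden state feeds the ranking head.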
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; only a limited set of checks runs automatically. Once the PR is approved and ready to go, your PR reviewer(s) can run the full CI to test the changes comprehensively before merging.
Code Review
The pull request introduces support for the JinaVL Reranker model within the vLLM framework, including necessary code modifications and tests. The changes primarily involve adding new model classes, updating configurations, and adjusting input processing to accommodate multi-modal data. The code appears well-structured and includes necessary components for integrating the JinaVL model, but some docstrings and error messages could be improved for clarity.
This pull request has merge conflicts that must be resolved before it can be merged.
Thanks for adding this model! Some initial comments.
There are a few places that need changes or clarifications, but it's a nice addition overall. Thanks for contributing.
This is the first supported VL Reranker, thank you for getting it working.
Signed-off-by: shineran96 <shinewang96@gmail.com>
Some code changes involving this PR landed on the upstream branch, so I had to rebase and update the code accordingly.
I encountered this: `Following weights were not initialized from checkpoint: {'language_model.score.weight'}`. I'm using vLLM 0.10.0.
I think this problem has been fixed on the main branch; you can try installing vLLM from source or using the nightly Docker image.
Thank you, fixed.
Purpose
Support JinaVL Reranker Model
- Support `JinaVLForRanking` architectures ✅
- multimodal for `LLM.score` ✅
- multimodal for `v1/score` ✅
- multimodal for `v1/rerank` ✅

Usage
Offline Inference
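A minimal offline-inference sketch. The model name `jinaai/jina-reranker-m0` and the multimodal document schema below are assumptions for illustration; the example script added by this PR (`examples/offline_inference/jinavl_reranker.py`) is the authoritative reference.

```python
# Hedged sketch of offline scoring with a text query against an image
# document. The input schema (OpenAI-style image_url content parts) is an
# assumption; see the merged example script for the authoritative version.
try:
    from vllm import LLM
except ImportError:  # vLLM not installed; keep the sketch importable anyway
    LLM = None

query = "What is the capital of France?"  # illustrative query
image_doc = {  # an image document, as an image_url content part (assumed schema)
    "content": [
        {"type": "image_url",
         "image_url": {"url": "https://example.com/paris.jpg"}},
    ]
}

if LLM is not None:
    # task="score" selects the cross-encoder / reranking path
    llm = LLM(model="jinaai/jina-reranker-m0", task="score")
    outputs = llm.score(query, [image_doc])
    for out in outputs:
        print(out.outputs.score)
```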
Online Serving
Test `v1/score` with curl.
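The same request can be sketched in Python instead of curl. This is a hedged illustration: the model name, the multimodal content schema, and the server address are assumptions; `text_1`/`text_2` follow the score API's pair shape.

```python
# Hedged sketch: build and (optionally) send a v1/score request.
# Flip SERVER_RUNNING to True only when a vLLM server is up, e.g. after
# `vllm serve jinaai/jina-reranker-m0` (model name assumed).
import json
import urllib.request

SERVER_RUNNING = False

payload = {
    "model": "jinaai/jina-reranker-m0",
    "text_1": "What is the capital of France?",
    "text_2": {  # image document as an image_url content part (assumed schema)
        "content": [
            {"type": "image_url",
             "image_url": {"url": "https://example.com/paris.jpg"}},
        ]
    },
}

if SERVER_RUNNING:
    req = urllib.request.Request(
        "http://localhost:8000/v1/score",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp))
```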
Test `v1/rerank` with curl.
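A matching sketch for the rerank endpoint, which takes one query against a list of documents. As above, the model name, content schema, and server address are assumptions; `query`/`documents` follow the common rerank API shape.

```python
# Hedged sketch: build and (optionally) send a v1/rerank request that mixes
# an image document and a text document (schema assumed).
import json
import urllib.request

SERVER_RUNNING = False  # flip when a vLLM server is up at localhost:8000

payload = {
    "model": "jinaai/jina-reranker-m0",
    "query": "What is the capital of France?",
    "documents": [
        {"content": [  # image document (assumed image_url content-part schema)
            {"type": "image_url",
             "image_url": {"url": "https://example.com/paris.jpg"}},
        ]},
        "Berlin is the capital of Germany.",  # plain text document
    ],
}

if SERVER_RUNNING:
    req = urllib.request.Request(
        "http://localhost:8000/v1/rerank",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp))
```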
Fix #18447