[Bug Fix] Add partial rotary factor support for Phi-4 and upgrade to transformers v4.50.0 #3984
Conversation
@zhaochenyang20 ready to be reviewed. There are some accuracy inconsistencies in the CI, but it should be good.
@adarshxs thanks. Yi and I can help rerun the CI. @yizhang2077 could you help to review this?
@zhaochenyang20 @yizhang2077 any update on this?
@adarshxs Sorry I am late. Thanks for your work, I left some comments here~
@adarshxs great work!!! Do not rebase with main; let me rerun for you.
@adarshxs @zhaochenyang20 @yizhang2077 @mickqian You are great!!
```python
# fix: for Qwen2-VL model, inject default 'size' if not provided.
if config.model_type in {"qwen2_vl"}:
    if "size" not in kwargs:
        kwargs["size"] = {"shortest_edge": 3136, "longest_edge": 1003520}
```
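For context on where these constants come from: a minimal sketch, assuming the injected values mirror Qwen2-VL's legacy pixel-count defaults (3136 = 56×56 for `min_pixels`, 1003520 = 1280×28×28 for `max_pixels`). The helper name is illustrative, not part of the patch.

```python
# Sketch (assumption): map Qwen2-VL's legacy min_pixels/max_pixels bounds
# onto the newer {"shortest_edge", "longest_edge"} size dict that recent
# transformers image processors validate against.
def legacy_pixels_to_size(min_pixels: int = 3136,
                          max_pixels: int = 1003520) -> dict:
    # For Qwen2-VL the size-dict entries are interpreted as total pixel
    # budgets rather than literal edge lengths.
    return {"shortest_edge": min_pixels, "longest_edge": max_pixels}

size = legacy_pixels_to_size()
```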
I would like to ask about the intention of injecting the default 'size' here for the Qwen2-VL model. I noticed that after transformers version 4.54.0, this injection no longer works. I'm not sure whether I need to adjust it to make it work again.
As far as I remember, prior to transformers v4.50.0, the Qwen2-VL model's `preprocessor_config.json` only contained `min_pixels`/`max_pixels` and no explicit `shortest_edge` or `longest_edge`. As a result, loading those models under 4.50.0 would immediately throw `ValueError: size must contain 'shortest_edge' and 'longest_edge' keys`.
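Since the workaround is apparently needed starting at v4.50.0 but stops working after v4.54.0, one option is to gate the injection on the installed transformers version. A minimal sketch, assuming the helper name and the exact version bound (both are illustrative, not taken from the patch):

```python
# Sketch (assumption): only inject the default size dict for transformers
# versions that require it. The >= 4.50.0 bound comes from the discussion
# above; adjust if the incompatible window turns out to be different.
def _parse_version(v: str) -> tuple:
    # Compare only the numeric major.minor.patch components.
    return tuple(int(p) for p in v.split(".")[:3] if p.isdigit())

def maybe_inject_size(model_type: str, kwargs: dict, tf_version: str) -> dict:
    needs_injection = _parse_version(tf_version) >= (4, 50, 0)
    if model_type == "qwen2_vl" and needs_injection and "size" not in kwargs:
        kwargs["size"] = {"shortest_edge": 3136, "longest_edge": 1003520}
    return kwargs
```

In real code `tf_version` would come from `transformers.__version__`; it is a parameter here so the sketch stays self-contained.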
Motivation
Fixes: #3935
Modifications
- Add partial rotary embedding support and upgrade to `transformers==4.50.0`
- Also fix Qwen2.5-VL, which breaks when upgrading from `transformers==4.48.3` to `transformers==4.50.0`
- Minor fixes to the `reference_hf.py` script
Checklist