This repository open-sources the training code used for the online RL stage of InternVL3.5, which is built upon the PR in verl. Compared to the original PR, we have corrected the dialogue template for InternVL and updated a monkey patch for InternVL to enable sequence parallelism. For training details, please refer to the provided scripts.
We use MMPR-Tiny as the training dataset and initialize the model from the InternVL3.5 checkpoint obtained after MPO training. We also provide a packaged conda environment for easy reproduction.
For the original README of verl, please refer to this file.
Based on this codebase, the InternVL3.5 series achieves significant improvements in reasoning performance across all model scales.
We open-source our training data (i.e., MMPR-Tiny) on HuggingFace. To reproduce our training results, download this dataset and move it into this folder. Additionally, since verl requires a validation dataset to be loaded, please prepare this data using this script. The expected directory structure is as follows:
├── MMPR-Tiny
│   ├── images
│   └── mmpr_tiny.parquet
├── verl_data
│   └── geo3k
│       └── test.parquet
├── verl
└── README.md
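If the referenced preparation script is not at hand, the snippet below is a minimal sketch of how a compatible validation parquet could be built. The dataset name (hiyouga/geometry3k) and the column schema are assumptions borrowed from verl's geo3k preprocessing example; the script shipped with this repository may use different fields.

# Hypothetical sketch: write verl_data/geo3k/test.parquet in the layout shown above.
# The dataset name and column names below are assumptions, not this repo's exact script.
import os
from datasets import load_dataset

local_dir = "verl_data/geo3k"
os.makedirs(local_dir, exist_ok=True)

test_split = load_dataset("hiyouga/geometry3k", split="test")

def to_verl_row(example, idx):
    # verl expects a chat-style prompt plus bookkeeping fields such as data_source
    # and reward_model; the exact keys expected by this repo's script may differ.
    return {
        "data_source": "hiyouga/geometry3k",
        "prompt": [{"role": "user", "content": example["problem"]}],
        "images": example["images"],
        "ability": "math",
        "reward_model": {"style": "rule", "ground_truth": example["answer"]},
        "extra_info": {"split": "test", "index": idx},
    }

test_split = test_split.map(to_verl_row, with_indices=True)
test_split.to_parquet(os.path.join(local_dir, "test.parquet"))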
After preparing the dataset and the packaged conda environment, you can launch training with the following command:
sh shell/internvl3_5_8b.sh
We mainly use VLMEvalKit to evaluate our models. Please refer to their documentation and our model configs for more details. As an example, you can define the model config as follows:
"InternVL3_5-8B-Thinking": partial(
InternVLChat,
model_path="/path/to/your/moel",
version="V2.0",
cot_prompt_version="r1",
max_new_tokens=32768,
do_sample=True,
use_lmdeploy=True,
),
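This entry is typically registered in VLMEvalKit's model config (e.g., the supported_VLM mapping); once registered, evaluation can be launched through VLMEvalKit's standard entry point, for example python run.py --data MMMU_DEV_VAL --model InternVL3_5-8B-Thinking, where the benchmark name is only an illustration.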
If you find this project useful in your research, please consider citing:
@article{wang2025internvl3_5,
  title={InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency},
  author={Wang, Weiyun and Gao, Zhangwei and Gu, Lixin and Pu, Hengjun and Cui, Long and Wei, Xingguang and Liu, Zhaoyang and Jing, Linglin and Ye, Shenglong and Shao, Jie and others},
  journal={arXiv preprint arXiv:2508.18265},
  year={2025}
}
@article{wang2024mpo,
  title={Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization},
  author={Wang, Weiyun and Chen, Zhe and Wang, Wenhai and Cao, Yue and Liu, Yangzhou and Gao, Zhangwei and Zhu, Jinguo and Zhu, Xizhou and Lu, Lewei and Qiao, Yu and Dai, Jifeng},
  journal={arXiv preprint arXiv:2411.10442},
  year={2024}
}