Skip to content

Conversation

nanjiangwill
Copy link
Contributor

Checklist Before Starting

  • Searched for similar PR(s).

What does this PR do?

This PR adds image input to sglang async rollout. Previously sglang async rollout only support text. There is also a placeholder for video data, will be added as an input when SGLang engine supports it.

High-Level Design

Since sglang engine already handle the image input, just need to properly handling the tokenization.

Specific Changes

Change self.tokenizer.apply_chat_template() to self.processing_class.apply_chat_template(). processing_class could be tokenizer or processor.

Usage Example

It will automatically using processor to process image when the model's processor supports that. It will use tokenizer if there is no processor available

Checklist Before Submitting

  • Read the Contribute Guide.
  • Apply pre-commit checks.
  • Add [BREAKING] to the PR title description if it breaks any API.
  • Update the documentation about your changes in the docs.
  • New CI unit test(s) are added to cover the code path.
  • Rely on existing unit tests on CI that covers the code path.

@nanjiangwill nanjiangwill force-pushed the feat/add-multimodal-multiturn-sglang branch from 2b560a9 to e0d1bf2 Compare June 18, 2025 17:02
@zhaochenyang20
Copy link
Collaborator

@nanjiangwill Is it ready for us to test it in public?

@nanjiangwill
Copy link
Contributor Author

yes it is ready for public testing

@nanjiangwill nanjiangwill requested a review from chenhaiq as a code owner June 19, 2025 17:28
Copy link
Collaborator

@SwordFaith SwordFaith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking forward to your reply and the opportunity to discuss this further.

@nanjiangwill nanjiangwill changed the title [sglang] feat: [BREAKING] add multimodal input to multiturn async rollout [BREAKING][sglang] feat: add multimodal input to multiturn async rollout Jun 21, 2025
@zhaochenyang20 zhaochenyang20 changed the title [BREAKING][sglang] feat: add multimodal input to multiturn async rollout [sglang] feat: add multimodal input to multiturn async rollout Jun 21, 2025
Copy link
Collaborator

@zhaochenyang20 zhaochenyang20 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazon/Amazing job 😂

Copy link
Collaborator

@SwordFaith SwordFaith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job! Nan

@zhaochenyang20 zhaochenyang20 merged commit 644aaa7 into volcengine:main Jun 22, 2025
43 of 44 checks passed
yellowbee686 pushed a commit to yellowbee686/verl that referenced this pull request Jun 23, 2025
…ngine#2014)

### Checklist Before Starting

- [X] Searched for similar PR(s).

### What does this PR do?
This PR adds image input to sglang async rollout. Previously sglang
async rollout only support text. There is also a placeholder for video
data, will be added as an input when SGLang engine supports it.

### High-Level Design

Since sglang engine already handle the image input, just need to
properly handling the tokenization.

### Specific Changes

Change `self.tokenizer.apply_chat_template()` to
`self.processing_class.apply_chat_template()`. `processing_class` could
be `tokenizer` or `processor`.


### Usage Example
It will automatically using processor to process image when the model's
processor supports that. It will use tokenizer if there is no processor
available

### Checklist Before Submitting

- [X] Read the [Contribute
Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide).
- [X] Apply [pre-commit
checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting).
- [X] Add `[BREAKING]` to the PR title `description` if it breaks any
API.
- [X] Update the documentation about your changes in the
[docs](https://github.com/volcengine/verl/tree/main/docs).
- [X] New CI unit test(s) are added to cover the code path.
- [X] Rely on existing unit tests on CI that covers the code path.

---------

Co-authored-by: xieck13 <xieck13@gmail.com>
Sirius-L1 pushed a commit to Sirius-L1/verl that referenced this pull request Jun 24, 2025
…ngine#2014)

### Checklist Before Starting

- [X] Searched for similar PR(s).

### What does this PR do?
This PR adds image input to sglang async rollout. Previously sglang
async rollout only support text. There is also a placeholder for video
data, will be added as an input when SGLang engine supports it.

### High-Level Design

Since sglang engine already handle the image input, just need to
properly handling the tokenization.

### Specific Changes

Change `self.tokenizer.apply_chat_template()` to
`self.processing_class.apply_chat_template()`. `processing_class` could
be `tokenizer` or `processor`.


### Usage Example
It will automatically using processor to process image when the model's
processor supports that. It will use tokenizer if there is no processor
available

### Checklist Before Submitting

- [X] Read the [Contribute
Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide).
- [X] Apply [pre-commit
checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting).
- [X] Add `[BREAKING]` to the PR title `description` if it breaks any
API.
- [X] Update the documentation about your changes in the
[docs](https://github.com/volcengine/verl/tree/main/docs).
- [X] New CI unit test(s) are added to cover the code path.
- [X] Rely on existing unit tests on CI that covers the code path.

---------

Co-authored-by: xieck13 <xieck13@gmail.com>
Tyizhanshen pushed a commit to HyperdriveHustle/verl that referenced this pull request Jul 1, 2025
…ngine#2014)

### Checklist Before Starting

- [X] Searched for similar PR(s).

### What does this PR do?
This PR adds image input to sglang async rollout. Previously sglang
async rollout only support text. There is also a placeholder for video
data, will be added as an input when SGLang engine supports it.

### High-Level Design

Since sglang engine already handle the image input, just need to
properly handling the tokenization.

### Specific Changes

Change `self.tokenizer.apply_chat_template()` to
`self.processing_class.apply_chat_template()`. `processing_class` could
be `tokenizer` or `processor`.


### Usage Example
It will automatically using processor to process image when the model's
processor supports that. It will use tokenizer if there is no processor
available

### Checklist Before Submitting

- [X] Read the [Contribute
Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide).
- [X] Apply [pre-commit
checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting).
- [X] Add `[BREAKING]` to the PR title `description` if it breaks any
API.
- [X] Update the documentation about your changes in the
[docs](https://github.com/volcengine/verl/tree/main/docs).
- [X] New CI unit test(s) are added to cover the code path.
- [X] Rely on existing unit tests on CI that covers the code path.

---------

Co-authored-by: xieck13 <xieck13@gmail.com>
oseyosey pushed a commit to oseyosey/verl that referenced this pull request Jul 28, 2025
…ngine#2014)

### Checklist Before Starting

- [X] Searched for similar PR(s).

### What does this PR do?
This PR adds image input to sglang async rollout. Previously sglang
async rollout only support text. There is also a placeholder for video
data, will be added as an input when SGLang engine supports it.

### High-Level Design

Since sglang engine already handle the image input, just need to
properly handling the tokenization.

### Specific Changes

Change `self.tokenizer.apply_chat_template()` to
`self.processing_class.apply_chat_template()`. `processing_class` could
be `tokenizer` or `processor`.


### Usage Example
It will automatically using processor to process image when the model's
processor supports that. It will use tokenizer if there is no processor
available

### Checklist Before Submitting

- [X] Read the [Contribute
Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide).
- [X] Apply [pre-commit
checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting).
- [X] Add `[BREAKING]` to the PR title `description` if it breaks any
API.
- [X] Update the documentation about your changes in the
[docs](https://github.com/volcengine/verl/tree/main/docs).
- [X] New CI unit test(s) are added to cover the code path.
- [X] Rely on existing unit tests on CI that covers the code path.

---------

Co-authored-by: xieck13 <xieck13@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants