Multiple processor classes have input side-effects #36865

@nph4rd

Description

Multiple processor classes mutate their text input when it's a list.

Example:

https://github.com/huggingface/transformers/blob/42c489f2ae738a3b690bb90aab274f02ff024795/src/transformers/models/qwen2_5_vl/processing_qwen2_5_vl.py#L156C21-L156C25

This results in unwanted downstream behaviour. For example, see this comment.

In my opinion, neither TRL nor vLLM should have to work around this behaviour.
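To make the failure mode concrete, here is a minimal sketch of how in-place assignment into the caller's list can surface downstream. It does not use the real processor: `expand_in_place` and `IMAGE_TOKEN` are hypothetical stand-ins, and the expansion logic is deliberately simplified.

```python
# Hypothetical placeholder token, chosen only for illustration.
IMAGE_TOKEN = "<|image_pad|>"


def expand_in_place(text, num_tokens):
    """Toy stand-in for a processor call that assigns into its input list."""
    if not isinstance(text, list):
        text = [text]
    for i in range(len(text)):
        # Assigning to text[i] modifies the list object the caller passed in.
        text[i] = text[i].replace(IMAGE_TOKEN, IMAGE_TOKEN * num_tokens, 1)
    return text


prompts = ["Describe " + IMAGE_TOKEN]

expand_in_place(prompts, 3)
count_after_first_call = prompts[0].count(IMAGE_TOKEN)

# A caller that reuses `prompts` (for example, to process the same batch
# again) now starts from an already expanded prompt.
expand_in_place(prompts, 3)
count_after_second_call = prompts[0].count(IMAGE_TOKEN)

print(count_after_first_call, count_after_second_call)
```

In this toy version the caller's prompt grows on every call, which is the kind of surprise that downstream libraries would otherwise have to guard against by copying their inputs defensively.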

Who can help?

@ArthurZucker @qubvel

Reproduction

The case of Qwen2.5-VL can be tested here.

Expected behavior

Ideally, calling a processor should have no side effects on its inputs.
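One possible way to get this behaviour, sketched under the assumption that the mutation comes from item assignment on the input list, is to make a shallow copy before modifying anything. The function and token names below are hypothetical and are not taken from the library.

```python
# Hypothetical placeholder token, chosen only for illustration.
IMAGE_TOKEN = "<|image_pad|>"


def expand_without_side_effects(text, num_tokens):
    """Toy processor step that leaves the caller's list untouched."""
    if isinstance(text, str):
        text = [text]
    else:
        # A shallow copy is enough here: the elements are immutable strings,
        # so only the list container needs to be duplicated.
        text = list(text)
    for i in range(len(text)):
        text[i] = text[i].replace(IMAGE_TOKEN, IMAGE_TOKEN * num_tokens, 1)
    return text


original = ["Describe " + IMAGE_TOKEN]
expanded = expand_without_side_effects(original, 3)

print(original[0].count(IMAGE_TOKEN), expanded[0].count(IMAGE_TOKEN))
```

Copying at the top of the call keeps the rest of the function unchanged, at the cost of one extra list allocation per call.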
