Multiple processor classes mutate their `text` input when it is a list. Example: https://github.com/huggingface/transformers/blob/42c489f2ae738a3b690bb90aab274f02ff024795/src/transformers/models/qwen2_5_vl/processing_qwen2_5_vl.py#L156C21-L156C25

This results in unwanted downstream behaviour. For example, see [this comment](https://github.com/huggingface/trl/pull/3072#issuecomment-2741246702). In my opinion, neither TRL nor vLLM should have to work around this behaviour.

### Who can help?

@ArthurZucker @qubvel

### Reproduction

The case of Qwen2.5-VL can be tested [here](https://gist.github.com/nph4rd/f003323ac4c8940f779f44a24b815ff7).

### Expected behavior

Ideally, the processor call should have no side effects on its inputs: a list passed as `text` should be unchanged after the call.
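
To make the side effect concrete without loading a model, here is a minimal sketch of the pattern. It is a toy stand-in, not the actual `transformers` code: `IMAGE_TOKEN`, `expand_mutating`, and `expand_non_mutating` are hypothetical names, and the token expansion is simplified to a fixed repeat count. It only shows how assigning into `text[i]` leaks back to the caller, and how a shallow copy might avoid that.

```python
from typing import List

# Hypothetical placeholder token, chosen only for this sketch.
IMAGE_TOKEN = "<|image_pad|>"


def expand_mutating(text: List[str], num_tokens: int) -> List[str]:
    """Toy stand-in for a processor step that rewrites the caller's list in place."""
    for i in range(len(text)):
        # Assigning to text[i] modifies the list object the caller passed in.
        text[i] = text[i].replace(IMAGE_TOKEN, IMAGE_TOKEN * num_tokens, 1)
    return text


def expand_non_mutating(text: List[str], num_tokens: int) -> List[str]:
    """Same expansion, performed on a shallow copy of the input list."""
    text = list(text)  # strings are immutable, so a shallow copy is sufficient
    for i in range(len(text)):
        text[i] = text[i].replace(IMAGE_TOKEN, IMAGE_TOKEN * num_tokens, 1)
    return text


# With the mutating variant, the caller's prompt is silently expanded.
prompts = ["Describe this: <|image_pad|>"]
expand_mutating(prompts, 3)
mutated = prompts[0]

# With the copying variant, the caller's list is left as it was.
prompts2 = ["Describe this: <|image_pad|>"]
out = expand_non_mutating(prompts2, 3)
```

If a caller reuses `prompts` after the mutating call (for example, passing the same list to a second component), it sees the already-expanded string, which is the kind of downstream surprise described above. Copying the list at the top of the function is one possible fix; building a new list with a comprehension would work equally well.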