
Unnecessary breaking change in SFTTrainer._prepare_dataset from 0.19.0 compared to 0.18.2 #3641

@jannisborn

Description

Reproduction

In #3572, @qgallouedec simplified the processing of conversational data.
However, the change also alters how the trainer interacts with the tokenizer: it switches from item access on the tokenizer output (processed["input_ids"]) to attribute access (processed.input_ids), where processed is whatever the tokenizer returns. The tokenizer is not necessarily under the library's control, since it is user-provided and may be custom.
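
For illustration, here is a minimal sketch of the failure mode (the tokenizer class below is a hypothetical stand-in for a user-defined tokenizer, not TRL or transformers code): a processing class that returns a plain dict supports item access but not attribute access.

# Hypothetical user-provided tokenizer returning a plain dict (toy tokenization).
class MyCustomTokenizer:
    def __call__(self, text):
        return {"input_ids": list(range(len(text.split())))}

processing_class = MyCustomTokenizer()
processed = processing_class(text="hello world")

processed["input_ids"]  # item access: works on a plain dict
processed.input_ids     # attribute access: raises AttributeError on a plain dict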

Is this an intentional breaking change? If so, why? It forces users to write their tokenizers so that they return a BatchEncoding rather than a plain dict.

This PR was merged between releases 0.18.2 and 0.19.0.

I am referring to this line:

prompt_ids = processing_class(text=example["prompt"]).input_ids
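
If the change is intentional, one possible user-side workaround (a sketch, assuming a custom tokenizer like the hypothetical one above) would be to wrap the returned dict in transformers.BatchEncoding, which supports both item and attribute access:

from transformers import BatchEncoding

# Same hypothetical tokenizer, now returning a BatchEncoding instead of a plain dict.
class MyCustomTokenizer:
    def __call__(self, text):
        ids = list(range(len(text.split())))  # toy tokenization
        return BatchEncoding({"input_ids": ids})

processed = MyCustomTokenizer()(text="some prompt")
assert processed["input_ids"] == processed.input_ids  # both access styles now work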

System Info

trl v0.19.0

Checklist

  • I have checked that my issue isn't already filed (see open issues)
  • I have included my system information
  • Any code provided is minimal, complete, and reproducible (more on MREs)
  • Any code provided is properly formatted in code blocks (no screenshots; more on code blocks)
  • Any traceback provided is complete

Labels

🏋 SFT (Related to SFT), 🐛 bug (Something isn't working)
