quantile() input type error when using bf16

### Reproduction

When training with bf16, `entropies` is a bfloat16 tensor, but PyTorch's `quantile()` only accepts float or double input, so the call at the following line raises a `RuntimeError`: https://github.com/huggingface/trl/blob/6a6d4345c9e0ded5bdcfc67ca2d8d20ecb75d309/trl/trainer/grpo_trainer.py#L1395
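A minimal sketch of the failure outside the trainer, plus one possible workaround. The tensor shape, the 0.8 quantile, and the helper name `quantile_upcast` are hypothetical and chosen only for illustration; upcasting to float32 is an assumption about how this could be worked around, not necessarily the fix the maintainers would adopt.

```python
import torch


def quantile_upcast(values: torch.Tensor, q: float) -> torch.Tensor:
    """Hypothetical helper: upcast to float32 before calling torch.quantile."""
    return torch.quantile(values.flatten().float(), q)


# Stand-in for the per-token entropies produced under bf16 training.
entropies = torch.rand(4, 16).to(torch.bfloat16)

try:
    # Passing the bf16 tensor directly is what the issue reports as failing.
    torch.quantile(entropies.flatten(), 0.8)
except RuntimeError as err:
    print(f"bf16 input rejected: {err}")

# Workaround sketch: compute the threshold in float32, then compare in float32.
threshold = quantile_upcast(entropies, 0.8)
mask = entropies.float() >= threshold
```

The upcast only touches the flattened copy passed to `quantile()`, so the original `entropies` tensor keeps its bf16 dtype for the rest of the computation.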



### System Info

RuntimeError: quantile() input tensor must be either float or double dtype

### Checklist

- [ ] I have checked that my issue isn't already filed (see [open issues](https://github.com/huggingface/trl/issues?q=is%3Aissue))
- [ ] I have included my system information
- [ ] Any code provided is minimal, complete, and reproducible ([more on MREs](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/creating-and-highlighting-code-blocks))
- [ ] Any code provided is properly formatted in code blocks (no screenshots, [more on code blocks](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/creating-and-highlighting-code-blocks))
- [ ] Any traceback provided is complete

quantile() input type error when using bf16 #3666
