🎭 Fix train and eval mode checking in `GRPOTrainer` and `SFTTrainer` #3337

I-l-l-I · 2025-04-22T05:01:05Z

What does this PR do?

Use `model.training` instead of `control.should_evaluate` to determine whether the model is in evaluation mode.

Motivation

`should_evaluate` is not an accurate indicator of the current mode. For example, if the trainer needs to both log and evaluate at a given training step, the log is executed twice at that step. The first log should report the training metrics, but at that point `should_evaluate=True` already, because `should_evaluate` means that evaluation will run after this step, not that it is running now.
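To make the double-log scenario concrete, here is a minimal, dependency-free sketch. `ToyModel`, `ToyControl`, and the helper functions are hypothetical stand-ins written for illustration, not the actual TRL or Transformers classes; the only assumption carried over from the real libraries is that the model exposes a `training` flag toggled by `train()` and `eval()`, and that the control object carries a `should_evaluate` flag.

```python
class ToyControl:
    """Hypothetical stand-in for the trainer's control object."""

    def __init__(self):
        self.should_evaluate = False


class ToyModel:
    """Hypothetical stand-in exposing a `training` flag like a PyTorch module."""

    def __init__(self):
        self.training = True

    def train(self):
        self.training = True

    def eval(self):
        self.training = False


def mode_from_control(model, control):
    # Old check: treats "evaluation is scheduled" as "evaluation is running".
    return "eval" if control.should_evaluate else "train"


def mode_from_model(model, control):
    # New check: reflects the mode the model is actually in right now.
    return "train" if model.training else "eval"


def run_step_with_eval(pick_mode):
    """Simulate one step on which both logging and evaluation are due."""
    model, control = ToyModel(), ToyControl()
    seen = []

    # End of the training step: evaluation is scheduled, but the model is
    # still in training mode when the training metrics are logged.
    control.should_evaluate = True
    seen.append(pick_mode(model, control))  # first log: training metrics

    # Evaluation actually runs: the model is switched to eval mode.
    model.eval()
    seen.append(pick_mode(model, control))  # second log: evaluation metrics

    # Back to training for the next step.
    model.train()
    control.should_evaluate = False
    return seen


old = run_step_with_eval(mode_from_control)
new = run_step_with_eval(mode_from_model)
print(old)  # ['eval', 'eval']
print(new)  # ['train', 'eval']
```

With the flag-based check, the training metrics from the first log are filed under the evaluation mode; with the model-based check, each log is attributed to the mode the model is in when the log happens.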

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

qgallouedec

That's a very good point @I-l-l-I!

I've tested it myself and can confirm that the current implementation does have the issue you describe, and that your solution works like a charm.
I've also fixed it in SFT.

HuggingFaceDocBuilderDev · 2025-04-25T23:44:25Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

…3337) Co-authored-by: Jiaming Ma <jiaming.ma@connect.polyu.hk> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>

Jiaming Ma and others added 3 commits April 22, 2025 12:40

Fix train and eval mode checking in GRPOTrainer

f0a4cd5

Merge branch 'main' into fix-mode

b646cc3

fix mode in sft

e1f2752

qgallouedec approved these changes Apr 25, 2025

qgallouedec changed the title ~~Fix train and eval mode checking in GRPOTrainer~~ 🎭 Fix train and eval mode checking in GRPOTrainer and SFTTrainer Apr 25, 2025

qgallouedec merged commit 39e9639 into huggingface:main Apr 26, 2025
9 checks passed

hjh0119 mentioned this pull request Apr 29, 2025

updates GRPOTrainer compatible with trl 0.17 modelscope/ms-swift#3969

Merged

4 tasks
