feat: Pass trainer state to reward functions #3669

seungduk-yanolja · 2025-06-30T07:00:33Z

This allows for implementing dynamic reward strategies like curriculum learning, where rewards can be adjusted based on training progress. Fixes #3668
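
As a rough sketch of the curriculum idea described above, a reward function could read training progress from the state object and tighten its criterion over time. The keyword name `trainer_state`, the `global_step` and `max_steps` attributes, the stand-in `FakeTrainerState` class, and the character budgets are all assumptions made for illustration, not taken from this PR's diff. Completions are assumed to be plain strings.

```python
from dataclasses import dataclass


@dataclass
class FakeTrainerState:
    """Hypothetical stand-in for the trainer's state object (illustration only)."""

    global_step: int
    max_steps: int


def curriculum_length_reward(completions, trainer_state=None, **kwargs):
    """Reward completions that fit a length budget which shrinks as training progresses.

    `trainer_state` is assumed to be passed as a keyword argument; if it is
    missing, the function falls back to the loosest budget.
    """
    if trainer_state is None or trainer_state.max_steps <= 0:
        progress = 0.0
    else:
        # Fraction of training completed, clamped to [0, 1].
        progress = min(trainer_state.global_step / trainer_state.max_steps, 1.0)

    # Budget shrinks linearly from 200 characters (start) to 50 characters (end).
    budget = 200 - 150 * progress
    return [1.0 if len(c) <= budget else 0.0 for c in completions]


# Usage: the same completion is accepted early in training and rejected late.
sample = ["x" * 100]
early = curriculum_length_reward(sample, trainer_state=FakeTrainerState(0, 100))
late = curriculum_length_reward(sample, trainer_state=FakeTrainerState(100, 100))
```

Accepting `**kwargs` keeps the function tolerant of any other keyword arguments the trainer may pass alongside the state.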

tests/test_grpo_trainer.py

kashif · 2025-06-30T10:18:13Z

Small nit with the test, but this seems like a useful thing to have!

HuggingFaceDocBuilderDev · 2025-06-30T10:21:26Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>

tests/test_grpo_trainer.py

trl/trainer/grpo_trainer.py

seungduk-yanolja · 2025-07-01T11:17:01Z

Does it require approval whenever I create a merge commit?

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>

seungduk-yanolja force-pushed the trainer_state branch 6 times, most recently from 79ff77a to f67602f, on June 30, 2025 07:48

feat: Pass trainer state to reward functions

ba631ea

This allows for implementing dynamic reward strategies like curriculum learning, where rewards can be adjusted based on training progress. Fixes huggingface#3668

seungduk-yanolja force-pushed the trainer_state branch from f67602f to ba631ea on June 30, 2025 07:51

kashif reviewed Jun 30, 2025

tests/test_grpo_trainer.py (outdated, resolved)

Update tests/test_grpo_trainer.py

1e5dd41

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>

kashif approved these changes Jun 30, 2025

seungduk-yanolja commented Jun 30, 2025

tests/test_grpo_trainer.py (outdated, resolved)

Update tests/test_grpo_trainer.py

a5ce51a

kashif reviewed Jul 1, 2025

trl/trainer/grpo_trainer.py (outdated, resolved)

Update trl/trainer/grpo_trainer.py

cfe0649

kashif reviewed Jul 1, 2025

trl/trainer/grpo_trainer.py (outdated, resolved)

Update trl/trainer/grpo_trainer.py

e2c4aa9

kashif approved these changes Jul 1, 2025

Merge branch 'main' into trainer_state

14d44d4

kashif merged commit e04f7eb into huggingface:main Jul 1, 2025

marcandrelarochelle pushed a commit to marcandrelarochelle/trl that referenced this pull request Jul 29, 2025

feat: Pass trainer state to reward functions (huggingface#3669)

3959596

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
