Enable number of printed completions to be set #3149

lewtun · 2025-03-24T08:51:59Z

What does this PR do?

This PR exposes a num_completions_to_print argument in GRPOConfig so that users can control how many completions are printed to the terminal. I found that the default (log everything) made the logs very verbose and large.

After discussing with @edbeeching, I decided not to expose this for WandB, since there we don't have to worry about the logs becoming overloaded.
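
As a rough illustration of the option described above, the sketch below models a config object with an optional num_completions_to_print field. The class name SketchGRPOConfig and the helper completions_to_show are hypothetical stand-ins, not TRL's actual code. Treating None as "print everything" is an assumption based on the default described in this PR, and the clamping of out-of-range values is only one possible choice.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SketchGRPOConfig:
    """Hypothetical stand-in for GRPOConfig; only the field discussed here."""

    # Assumption: None means "print every completion" (the previous default).
    num_completions_to_print: Optional[int] = field(
        default=None,
        metadata={"help": "Number of completions to print to the terminal."},
    )


def completions_to_show(config: SketchGRPOConfig, total: int) -> int:
    """Return how many of `total` completions would be printed (illustrative)."""
    if config.num_completions_to_print is None:
        return total
    # Clamp to [0, total] so oversized or negative values stay harmless.
    return max(0, min(config.num_completions_to_print, total))
```

With this sketch, a config created without the argument keeps the old behaviour, while SketchGRPOConfig(num_completions_to_print=3) limits the terminal output to three rows per step.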

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

lewtun · 2025-03-24T08:52:50Z

trl/trainer/grpo_trainer.py

@@ -905,6 +907,8 @@ def _generate_and_score_completions(
                        "reward": rewards.tolist(),
                    }
                    df = pd.DataFrame(table)
+                    if self.num_completions_to_log is not None:


For consistency, I've enabled subsampling here, but we could skip it for WandB since, AFAIK, it doesn't really matter if there are a lot of completions.

edbeeching · Mar 24, 2025

I think for WandB we should just keep everything? The issue is more that the terminal logs get spammed.

lewtun · Mar 24, 2025

Sounds good, I'll revert.

HuggingFaceDocBuilderDev · 2025-03-24T08:55:59Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

trl/trainer/utils.py

edbeeching

LGTM apart from comment.

What happens with num_samples=0?

qgallouedec

LGTM :)

qgallouedec · 2025-03-24T13:00:22Z

The CI failure is unrelated, by the way; you can safely ignore it.

qgallouedec · 2025-03-24T23:07:41Z

>>> print_prompt_completions_sample(prompts, completions, rewards, 42, -1)
>>> print_prompt_completions_sample(prompts, completions, rewards, 42, 0)
>>> print_prompt_completions_sample(prompts, completions, rewards, 42, 1)
╭────────────── Step 42 ───────────────╮
│ ┏━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━┓ │
│ ┃ Prompt     ┃ Completion ┃ Reward ┃ │
│ ┡━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━┩ │
│ │ The sky is │  blue.     │   0.12 │ │
│ └────────────┴────────────┴────────┘ │
╰──────────────────────────────────────╯
>>> print_prompt_completions_sample(prompts, completions, rewards, 42, 2)
╭─────────────── Step 42 ────────────────╮
│ ┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━┓ │
│ ┃ Prompt     ┃ Completion   ┃ Reward ┃ │
│ ┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━┩ │
│ │ The sky is │  blue.       │   0.12 │ │
│ ├────────────┼──────────────┼────────┤ │
│ │ The sun is │  in the sky. │   0.69 │ │
│ └────────────┴──────────────┴────────┘ │
╰────────────────────────────────────────╯
>>> print_prompt_completions_sample(prompts, completions, rewards, 42, 3)
╭─────────────── Step 42 ────────────────╮
│ ┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━┓ │
│ ┃ Prompt     ┃ Completion   ┃ Reward ┃ │
│ ┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━┩ │
│ │ The sky is │  blue.       │   0.12 │ │
│ ├────────────┼──────────────┼────────┤ │
│ │ The sun is │  in the sky. │   0.69 │ │
│ └────────────┴──────────────┴────────┘ │
╰────────────────────────────────────────╯
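
The edge cases shown in the session above (negative, zero, and oversized num_samples) could be handled along the following lines. This is a minimal sketch, not the code merged in this PR: the function name sample_rows and its seed parameter are hypothetical, and it returns the selected rows instead of rendering a table. The example values are taken from the output above.

```python
import random
from typing import List, Optional, Sequence, Tuple


def sample_rows(
    prompts: Sequence[str],
    completions: Sequence[str],
    rewards: Sequence[float],
    num_samples: Optional[int] = None,
    seed: Optional[int] = None,
) -> List[Tuple[str, str, float]]:
    """Pick the (prompt, completion, reward) rows that would be printed."""
    rows = list(zip(prompts, completions, rewards))
    # None or a value at least as large as the batch: show everything.
    if num_samples is None or num_samples >= len(rows):
        return rows
    # Zero or negative: show nothing.
    if num_samples <= 0:
        return []
    # Otherwise draw a random subset, kept in the original row order.
    rng = random.Random(seed)
    indices = sorted(rng.sample(range(len(rows)), num_samples))
    return [rows[i] for i in indices]


if __name__ == "__main__":
    prompts = ["The sky is", "The sun is"]
    completions = [" blue.", " in the sky."]
    rewards = [0.12, 0.69]
    for n in (-1, 0, 1, 2, 3):
        print(n, sample_rows(prompts, completions, rewards, n, seed=0))
```

Checking the oversized case before the non-positive case means an empty batch returns an empty list without ever calling random.sample.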

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>

Enable number of logged completions to be set

295aab7

lewtun requested review from qgallouedec and edbeeching March 24, 2025 08:52

lewtun commented Mar 24, 2025

edbeeching reviewed Mar 24, 2025

trl/trainer/utils.py (outdated, resolved)

edbeeching approved these changes Mar 24, 2025

lewtun added 2 commits March 24, 2025 09:24

Add post init on num_completions_to_log

27f7518

Remove wandb constraint:

41a14f4

lewtun changed the title from "Enable number of logged completions to be set" to "Enable number of printed completions to be set" Mar 24, 2025

qgallouedec reviewed Mar 24, 2025

trl/trainer/grpo_config.py (outdated, resolved)

qgallouedec reviewed Mar 24, 2025

trl/trainer/utils.py (outdated, resolved)

qgallouedec reviewed Mar 24, 2025

trl/trainer/utils.py (outdated, resolved)

qgallouedec reviewed Mar 24, 2025

trl/trainer/grpo_config.py (outdated, resolved)

qgallouedec approved these changes Mar 24, 2025

lewtun and others added 6 commits March 24, 2025 23:32

Apply suggestions from code review

8b302e9

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

Merge branch 'main' into set-log-samples

e8c0764

fix 1. when num_samples > len(prompts) and 2. num_samples=None (don't print)

f6f93c7

Revert "fix 1. when num_samples > len(prompts) and 2. num_samples=None (don't print)"

6aa87de

This reverts commit f6f93c7.

Handle num_samples <= len(prompts) and num_samples >= len(prompts)

af006c0

fix num_samples >= len(prompts)

83e5776

lewtun merged commit 1a9387b into main Mar 25, 2025
9 of 14 checks passed

lewtun deleted the set-log-samples branch March 25, 2025 07:47

qgallouedec mentioned this pull request Mar 26, 2025

improvement(utils.py): simplify repeating completion string #3122

Closed
