Skip to content

Conversation

edbeeching
Copy link
Collaborator

@edbeeching edbeeching commented Mar 31, 2025

What does this PR do?

  • Adds min, mean and max logging of completion lengths
  • Also logs min, mean, max completion logs for sequences that terminate in EOS
  • Adds clipped completion ratio from SimpleRL-Zero
    Example:
    image

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Copy link
Member

@lewtun lewtun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a suggestion to also log the clip fraction of truncated completions

self._metrics[mode]["max_completion_length"].append(max_completion_length)

# identify sequences that terminated with EOS and log their lengths
agg_terminated_with_eos = self.accelerator.gather_for_metrics(is_eos.any(dim=1))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we now have the sequences with / without EOS, WDYT about also logging clip_ratio_completion_length or truncated_completion_length_ratio as suggested in SimpleRL-Zero:

Screenshot 2025-03-31 at 11 24 36

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just num_terminated_completions / num_total_completions ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it's num_truncated_completions / num_total_completions (i.e. a high clip ratio means a lot of completions are too verbose)

Copy link
Member

@qgallouedec qgallouedec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Can you just add these new metrics to the doc (there is a section for this "logged metrics")

@edbeeching edbeeching merged commit 9f3702f into main Apr 1, 2025
7 of 10 checks passed
@edbeeching edbeeching deleted the improve-completions-logging branch April 1, 2025 08:00
@lewtun lewtun mentioned this pull request Apr 4, 2025
5 tasks
BjarniHaukur pushed a commit to ASSERT-KTH/trl that referenced this pull request Apr 15, 2025
…ggingface#3131)

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>

log answer key to wandb

all Table

HTML logging

table

bump patch

hmm

formatting

html esacape

reward isnt string

[Liger] Liger KTO support (huggingface#2812)

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>

🏃 Migrate CI to self-hosted runners (huggingface#3174)

❤️‍🩹 [CI] fix transformers dev CI failure (huggingface#3176)

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

⏯️ Fix: handle None inputs when resuming GRPO Trainer from checkpoint (huggingface#3148)

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

📎 Fix is_clipped to compute the effective clip_ratio (huggingface#3175)

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>

Fix breaking typo for flash_attention reducing_memory_usage.md (huggingface#3190)

Show unique prompts in GRPO WandB tables (huggingface#3191)

🐗 [CI] Fix trufflehog false positives (huggingface#3192)

[GRPO] Improve completion length logging (huggingface#3188)

preliminary openai compatible endpoint

early concept, needs refining

dedupe

debug print

some slop to work on

unslop, missing hist

almost valid pseudocode

middle-ware monkey patch in mp.Pool()...

remove unused

More accurate .md

need gpu

renting lambda again

much nicer

small

aider-chat and datasets conflict

risky reqs change

should work, but hacky

some insights, but monkeypatching probably wont suffice

refactor: Rewrite test script to use SWE-bench dataset with MultiProcessAider

refactor: Remove logging statements from test.py

one step closer

finally, the correct abstraction

doc

todo

unslop

unslop

undo accidental black

cleaner abstraction

new abstraction
yxliu-TAMU pushed a commit to mincheolseong/ECEN743-GRPO-Project-Proposal that referenced this pull request Apr 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants