Conversation

qgallouedec (Member)

The padding side was wrong; this PR fixes it. It also includes a few cleanups from a late review of #3471.

cc @ArthurZucker for viz

  prompt_ids = [torch.tensor(ids, device=device) for ids in prompt_inputs.input_ids]
- prompt_ids = pad(prompt_ids, padding_value=self.processing_class.pad_token_id)
+ prompt_ids = pad(prompt_ids, padding_value=self.processing_class.pad_token_id, padding_side="left")
qgallouedec (Member Author)

The padding side must be "left" for prompts; otherwise you'd get (incorrect):

[[P01, P02, P03, PAD, PAD, C01, C02, C03, C04],
 [P11, P12, P13, P14, P15, C11, C12, PAD, PAD]]

instead of (correct):

[[PAD, PAD, P01, P02, P03, C01, C02, C03, C04],
 [P11, P12, P13, P14, P15, C11, C12, PAD, PAD]]
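A minimal, self-contained sketch of the two padding sides (the `pad` helper below is a simplified stand-in for TRL's `pad` utility, not the actual implementation). Left-padding aligns every prompt's last token at the same position, so the completions that follow all start in the same column:

```python
import torch

def pad(tensors, padding_value=0, padding_side="right"):
    # Pad a list of 1-D tensors to a common length on the chosen side.
    max_len = max(t.size(0) for t in tensors)
    out = torch.full((len(tensors), max_len), padding_value, dtype=tensors[0].dtype)
    for i, t in enumerate(tensors):
        if padding_side == "left":
            out[i, max_len - t.size(0):] = t  # tokens flush right, PADs on the left
        else:
            out[i, :t.size(0)] = t  # tokens flush left, PADs on the right
    return out

prompts = [torch.tensor([1, 2, 3]), torch.tensor([4, 5, 6, 7, 8])]
print(pad(prompts, padding_value=0, padding_side="left").tolist())
# [[0, 0, 1, 2, 3], [4, 5, 6, 7, 8]]  -- both prompts end at the last column
print(pad(prompts, padding_value=0, padding_side="right").tolist())
# [[1, 2, 3, 0, 0], [4, 5, 6, 7, 8]]  -- PADs would sit between prompt and completion
```

With right-padding, concatenating completions after the padded prompts puts PAD tokens in the middle of the sequence, which is the incorrect layout shown above.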

Contributor

Then how did we get reasonable output until now? 🧐

qgallouedec (Member Author)

The error only occurs when using CB (continuous batching), which was introduced in #3471.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Comment on lines +1235 to +1242
with (
profiling_context(self, "transformers.generate"),
unwrap_model_for_generation(
self.model_wrapped, self.accelerator, gather_deepspeed3_params=self.args.ds3_gather_for_generation
) as unwrapped_model,
torch.no_grad(),
FSDP.summon_full_params(self.model_wrapped, recurse=False) if self.is_fsdp_enabled else nullcontext(),
):
qgallouedec (Member Author)

Simplification: a single `with` statement can take multiple context managers,

with c1, c2:

instead of nesting:

with c1:
    with c2:
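The flattened form is behavior-preserving: managers are entered left-to-right and exited in reverse, exactly as when nested. A tiny sketch (the `Tag` class is hypothetical, written only to log enter/exit order; the parenthesized multi-manager form requires Python 3.10+):

```python
from contextlib import nullcontext

class Tag:
    """Hypothetical context manager that records enter/exit order."""
    def __init__(self, name, log):
        self.name, self.log = name, log
    def __enter__(self):
        self.log.append(f"enter {self.name}")
    def __exit__(self, *exc):
        self.log.append(f"exit {self.name}")

log = []
with (
    Tag("c1", log),
    Tag("c2", log),
    nullcontext(),  # a conditional no-op slots in cleanly, as in the PR
):
    log.append("body")

print(log)  # ['enter c1', 'enter c2', 'body', 'exit c2', 'exit c1']
```

This is why conditional managers like `FSDP.summon_full_params(...) if self.is_fsdp_enabled else nullcontext()` can be listed inline without changing semantics.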

@qgallouedec changed the title from "Fix CB in GRPO" to "↔️ Fix CB in GRPO" on Jul 11, 2025
kashif (Collaborator) commented Jul 11, 2025

Good catch! Let's also pin transformers to the latest patch release for the generation-length fix?

kashif (Collaborator) commented Jul 11, 2025

Can you kindly test with transformers>=4.53.2?

qgallouedec (Member Author)

> can you kindly test with transformers>=4.53.2

done in 17157c4

@qgallouedec qgallouedec merged commit 5a2b04a into main Jul 12, 2025
10 of 11 checks passed
@qgallouedec qgallouedec deleted the late-review-paged branch July 12, 2025 01:21
marcandrelarochelle pushed a commit to marcandrelarochelle/trl that referenced this pull request Jul 29, 2025
Co-authored-by: Shirin Yamani <75791599+shirinyamani@users.noreply.github.com>
LuisVasquezBSC pushed a commit to langtech-bsc/trl that referenced this pull request Aug 28, 2025
Co-authored-by: Shirin Yamani <75791599+shirinyamani@users.noreply.github.com>