[ORPO] fix orpo chosen-nll loss #2502

kashif · 2024-12-19T10:09:22Z

What does this PR do?

Calculate the ORPO chosen NLL loss with respect to the chosen completion only, rather than the whole prompt+completion.

Also return the shifted logits when the model is decoder-only.
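To make the difference concrete, here is a minimal, dependency-free sketch of a shifted NLL computed over all tokens versus over the completion only. It is not the trainer's code: the helper names (`log_softmax`, `shifted_nll`), the toy logits, and the ignore value of `-100` are assumptions chosen for illustration.

```python
import math

IGNORE_INDEX = -100  # assumed ignore value, standing in for label_pad_token_id


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def shifted_nll(logits, labels, ignore_index=IGNORE_INDEX):
    """Mean NLL for a decoder-only model, where logits[t] predicts labels[t + 1]."""
    shift_logits = logits[:-1]  # the last position has no next token to predict
    shift_labels = labels[1:]   # the first token is never predicted
    losses = [
        -log_softmax(row)[label]
        for row, label in zip(shift_logits, shift_labels)
        if label != ignore_index
    ]
    return sum(losses) / len(losses) if losses else 0.0


# Toy sequence: 2 prompt tokens followed by 2 completion tokens, vocabulary of 3.
input_ids = [0, 1, 2, 1]
logits = [
    [0.0, 2.0, 0.0],  # predicts the second prompt token (id 1)
    [0.0, 0.0, 0.0],  # predicts the first completion token (id 2)
    [0.0, 0.0, 0.0],  # predicts the second completion token (id 1)
    [0.0, 0.0, 0.0],  # dropped by the shift
]

labels_full = list(input_ids)                           # prompt + completion
labels_completion = [IGNORE_INDEX, IGNORE_INDEX, 2, 1]  # completion only

nll_full = shifted_nll(logits, labels_full)
nll_completion = shifted_nll(logits, labels_completion)
print(nll_full, nll_completion)
```

In this toy case the easily predicted prompt token pulls the full-sequence average down, so the two losses differ even though the completion predictions are identical.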

HuggingFaceDocBuilderDev · 2024-12-19T10:13:19Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

qgallouedec · 2024-12-19T10:32:23Z

trl/trainer/orpo_trainer.py

-            attention_mask = concatenated_batch["concatenated_attention_mask"]
-            labels = torch.where(attention_mask == 1, labels, self.label_pad_token_id)
-
+        labels = concatenated_batch["concatenated_labels"].clone()


Yes, we checked this together. If you do

labels = concatenated_batch["concatenated_input_ids"].clone()
attention_mask = concatenated_batch["concatenated_attention_mask"]
labels = torch.where(attention_mask == 1, labels, self.label_pad_token_id)

the prompt is not ignored: the attention mask only hides padding, so the prompt tokens still count toward the loss.
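The point can be checked on a toy row without any tensors. The sketch below is a plain-Python stand-in for the `torch.where` call; the token ids, the value `-100` for `label_pad_token_id`, and the assumption that `concatenated_labels` already carries the pad id on prompt and padding positions are all illustrative, not taken from the trainer.

```python
LABEL_PAD_TOKEN_ID = -100  # assumed value of self.label_pad_token_id

# Hypothetical batch row: 3 prompt tokens, 2 completion tokens, 1 padding token.
input_ids = [11, 12, 13, 21, 22, 0]
attention_mask = [1, 1, 1, 1, 1, 0]
# Assumption: the collated labels mask both the prompt and the padding.
concatenated_labels = [-100, -100, -100, 21, 22, -100]

# Old path: element-wise equivalent of
# torch.where(attention_mask == 1, labels, label_pad_token_id)
labels_from_mask = [
    token if mask == 1 else LABEL_PAD_TOKEN_ID
    for token, mask in zip(input_ids, attention_mask)
]

# New path: copy the labels that were built alongside the batch.
labels_from_batch = list(concatenated_labels)


def supervised_positions(labels):
    """Indices whose label is not the pad id, i.e. positions that enter the loss."""
    return [i for i, label in enumerate(labels) if label != LABEL_PAD_TOKEN_ID]


print(supervised_positions(labels_from_mask))   # prompt and completion positions
print(supervised_positions(labels_from_batch))  # completion positions only
```

Under these assumptions the mask-based labels supervise positions 0 through 4, while the batch labels supervise only the two completion positions.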

fix orpo chosen-nll loss

495bcac

kashif requested a review from qgallouedec December 19, 2024 10:09

kashif mentioned this pull request Dec 19, 2024

🐯 [Liger] add native liger-kernel ORPO loss #2482

Open

qgallouedec reviewed Dec 19, 2024


qgallouedec approved these changes Dec 19, 2024


kashif merged commit 88ad1a0 into main Dec 19, 2024

14 checks passed

kashif deleted the orpo-nll-fix branch December 19, 2024 10:33

kashif mentioned this pull request Dec 28, 2024

↩️ Revert ORPO loss changes #2527

Merged

yxliu-TAMU pushed a commit to mincheolseong/ECEN743-GRPO-Project-Proposal that referenced this pull request Apr 20, 2025

fix orpo chosen-nll loss (huggingface#2502)

c12d133
