Closed
Labels
🏋 Online DPO (Related to Online DPO), 🐛 bug (Something isn't working)
Description
Reproduction
trl/trl/trainer/online_dpo_trainer.py
Line 511 in 68db24e
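logits = output.logits[:, prompt_ids.size(1) - 1 : -1]
prompt_ids.size(1) may be 0 when completion_ids.size(1) >= self.max_length. In that case the slice start becomes -1, so output.logits[:, -1:-1] selects nothing and logits is empty along the sequence dimension.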
Proposed fix:
prompt_length = prompt_ids.size(1)
if prompt_length > 0:
    logits = output.logits[:, prompt_length - 1 : -1]
else:
    logits = output.logits
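To make the slicing behavior concrete, here is a minimal sketch that uses a plain Python list as a stand-in for one row of output.logits, so it runs without torch. The helper name slice_completion_logits and the sample values are hypothetical and are not part of trl; the point is only that a start index of -1 combined with a stop index of -1 yields an empty range, and that the guard above avoids this.

```python
def slice_completion_logits(logits_row, prompt_length):
    """Return the entries of logits_row kept for the completion tokens.

    logits_row is a plain list standing in for one row of output.logits
    (an assumption made so the sketch runs without torch).
    """
    if prompt_length > 0:
        # Position i predicts token i + 1, so start one step before the completion.
        return logits_row[prompt_length - 1 : -1]
    # Guard proposed in this issue: with no prompt tokens, keep every position.
    return logits_row


row = ["l0", "l1", "l2", "l3", "l4"]

# Unguarded slice with an empty prompt: 0 - 1 == -1, so the range is [-1:-1].
unguarded = row[0 - 1 : -1]
print(unguarded)  # []

print(slice_completion_logits(row, 2))  # ['l1', 'l2', 'l3']
print(slice_completion_logits(row, 0))  # ['l0', 'l1', 'l2', 'l3', 'l4']
```

Note that in the guarded branch the result is not shifted by one position, so whether it still lines up with completion_ids would need to be checked against the rest of the trainer.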
System Info
nothing
Checklist
- I have checked that my issue isn't already filed (see open issues)
- I have included my system information
- Any code provided is minimal, complete, and reproducible (more on MREs)
- Any code provided is properly formatted in code blocks (no screenshot, more on code blocks)
- Any traceback provided is complete