
Pull requests: OpenRLHF/OpenRLHF

Pull requests list

CLI support for top_k
#1104 opened Aug 13, 2025 by JoNeedsSleep
Update utils.py
#1101 opened Aug 6, 2025 by LiyuanLucasLiu
Reward model outputting one reward per rollout.
#1037 opened May 28, 2025 by NotTheStallion
Merge lmm-r1 for Multimodal PPO
#989 opened Apr 23, 2025 by TideDra
Fix Tokenizer Behavior for Special Placeholder Token
#894 opened Mar 20, 2025 by YuchenFan48
Support unbiased off-policy GRPO
#840 opened Mar 7, 2025 by LYMDLUT
Add support for max time per run
#711 opened Feb 5, 2025 by titu1994
Support SFT and DPO training for Qwen2VL
#665 opened Jan 10, 2025 by LiuXTao
Support rl logging board
#658 opened Jan 9, 2025 by HarderThenHarder
Ensure train datasets do not contain eval datasets
#594 opened Dec 17, 2024 by dingyuan-shi
Support broadcast vllm params by chunks
#593 opened Dec 17, 2024 by zhuzilin
Make sure there is always _some_ eval data
#582 opened Dec 13, 2024 by frrad
Support TRL's RLOO
#553 opened Dec 4, 2024 by songxxzp