Description
System Info
- Platform: Linux-3.10.0-693.11.6.el7.x86_64-x86_64-with-glibc2.17
- Python version: 3.9.5
- PyTorch version: 2.4.0
- CUDA device(s): not available
- Transformers version: 4.46.2
- Accelerate version: 1.1.1
- Accelerate config: not found
- Datasets version: 3.1.0
- HF Hub version: 0.26.2
- TRL version: 0.13.0.dev0
- bitsandbytes version: not installed
- DeepSpeed version: 0.15.4
- Diffusers version: not installed
- Liger-Kernel version: not installed
- LLM-Blender version: not installed
- OpenAI version: 1.54.4
- PEFT version: not installed
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder
- My own task or dataset (give details below)
Reproduction
While reproducing RLOO on a multi-GPU setup with the official script, training consistently halts midway, regardless of whether it is configured for 1,000 or 1 million episodes. For example, one wandb run ended at 1954 steps, whereas it should have reached 3908.
Expected behavior
Training should have run for 3908 steps; alternatively, the total step count may be miscalculated.
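For reference, the observed count (1954) is exactly half of the expected count (3908), which suggests the effective batch size used to derive the total step count is being doubled somewhere. A minimal sketch of the arithmetic (the function and variable names here are illustrative, not TRL's actual config fields, and the numbers are chosen only to reproduce the 3908/1954 ratio):

```python
import math

def expected_training_steps(total_episodes: int,
                            per_device_batch_size: int,
                            gradient_accumulation_steps: int,
                            num_gpus: int) -> int:
    # Effective batch size across all devices per optimizer step.
    # If this value is silently doubled (e.g. a sampling factor such as
    # rloo_k multiplying the batch), the derived step count halves.
    effective_batch = per_device_batch_size * gradient_accumulation_steps * num_gpus
    return math.ceil(total_episodes / effective_batch)

# Illustrative numbers only: 1,000,448 episodes over an effective batch
# of 256 gives 3908 steps; doubling the effective batch gives 1954.
print(expected_training_steps(1_000_448, 64, 1, 4))   # 3908
print(expected_training_steps(1_000_448, 128, 1, 4))  # 1954
```

If training stops at exactly half the expected steps, comparing this back-of-the-envelope count against the trainer's internally computed total would confirm whether this is a scheduling bug or a step miscalculation.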
Checklist
- I have checked that my issue isn't already filed (see open issues)
- I have included my system information
- Any code provided is minimal, complete, and reproducible (more on MREs)
- Any code provided is properly formatted in code blocks (no screenshots; more on code blocks)
- Any traceback provided is complete