Description
System Info
!pip install git+https://github.com/huggingface/trl.git
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder
- My own task or dataset (give details below)
Reproduction
!ACCELERATE_LOG_LEVEL=info accelerate launch --config_file multi_gpu.yaml \
    online_dpo.py \
    --model_name_or_path mistralai/Mistral-7B-v0.1 \
    --reward_model_path Ray2333/GRM-Llama3.2-3B-rewardmodel-ft \
    --dataset_name nvidia/HelpSteer2 \
    --learning_rate 5.0e-6 \
    --output_dir pythia-1b-tldr-online-dpo \
    --per_device_train_batch_size 16 \
    --gradient_accumulation_steps 8 \
    --warmup_ratio 0.1 \
    --missing_eos_penalty 1.0 \
    --use_peft
[2024-11-28 16:59:10,071] [INFO] [config.py:999:print] DeepSpeedEngine configuration:
Traceback (most recent call last):
  File "/home/ec2-user/SageMaker/Zhichao/UNA_online/UNA_peft/una_peft.py", line 356, in <module>
    trainer = OnlineDPOTrainer(
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/transformers/utils/deprecation.py", line 165, in wrapped_func
    return func(*args, **kwargs)
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/trl/trainer/online_dpo_trainer.py", line 286, in __init__
    self.ref_model = prepare_deepspeed(self.ref_model, args.per_device_train_batch_size, args.fp16, args.bf16)
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/trl/trainer/utils.py", line 1212, in prepare_deepspeed
    model, *_ = deepspeed.initialize(model=model, config=config_kwargs)
  File "/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/deepspeed/__init__.py", line 139, in initialize
    assert model is not None, "deepspeed.initialize requires a model"
AssertionError: deepspeed.initialize requires a model
Expected behavior
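The script should run without error. Instead, trainer construction fails: per the traceback, `OnlineDPOTrainer.__init__` passes `self.ref_model` to `prepare_deepspeed`, and `deepspeed.initialize` asserts that the model is `None`. I am running with `--use_peft` under a DeepSpeed config, which may be what leaves the reference model unset.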
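For illustration, the assertion suggests the reference model reaches the DeepSpeed wrapping step while it is still `None`. One possible guard is sketched below. It is not the actual TRL code or fix: `prepare_ref_model` and `prepare_fn` are hypothetical names, with `prepare_fn` standing in for whatever wraps the model (for example a call to `prepare_deepspeed`).

```python
from typing import Callable, Optional, TypeVar

M = TypeVar("M")


def prepare_ref_model(ref_model: Optional[M], prepare_fn: Callable[[M], M]) -> Optional[M]:
    """Wrap the reference model only when one actually exists.

    Hypothetical helper: `prepare_fn` is a stand-in for the real wrapping
    step (e.g. DeepSpeed initialization). If `ref_model` is None, the
    wrapping step is skipped and None is returned unchanged.
    """
    if ref_model is None:
        # Nothing to wrap; skip the call that would otherwise assert.
        return None
    return prepare_fn(ref_model)
```

With a guard like this, a `None` reference model would be passed through untouched instead of triggering the `deepspeed.initialize requires a model` assertion. Whether skipping the wrap is the right behavior here depends on how the trainer computes reference log-probabilities when no separate reference model exists.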
Checklist
- I have checked that my issue isn't already filed (see open issues)
- I have included my system information
- Any code provided is minimal, complete, and reproducible (more on MREs)
- Any code provided is properly formatted in code blocks, (no screenshot, more on code blocks)
- Any traceback provided is complete