
Conversation

vickytsang
Contributor

Checklist Before Starting

  • [x] Search for similar PR(s).

What does this PR do?

Update tensordict version

Resolve PPO training error

  • python3 -m verl.trainer.main_ppo algorithm.adv_estimator=gae data.train_files=/root/data/gsm8k/train.parquet data.val_files=/root/data/gsm8k/test.parquet data.train_batch_size=256 data.max_prompt_length=512 data.max_response_length=512 data.return_raw_chat=True actor_rollout_ref.model.path=/root/models/Qwen/Qwen2.5-0.5B actor_rollout_ref.model.use_liger=True actor_rollout_ref.actor.optim.lr=1e-6 actor_rollout_ref.model.use_remove_padding=True actor_rollout_ref.actor.optim.lr_warmup_steps_ratio=0.1 actor_rollout_ref.actor.ppo_mini_batch_size=128 actor_rollout_ref.actor.use_dynamic_bsz=False actor_rollout_ref.actor.ppo_max_token_len_per_gpu=32768 actor_rollout_ref.actor.ppo_micro_batch_size_per_gpu=2 actor_rollout_ref.actor.ulysses_sequence_parallel_size=1 actor_rollout_ref.actor.fsdp_config.param_offload=False actor_rollout_ref.actor.fsdp_config.optimizer_offload=False actor_rollout_ref.actor.use_kl_loss=False actor_rollout_ref.rollout.log_prob_max_token_len_per_gpu=32768 actor_rollout_ref.rollout.log_prob_micro_batch_size_per_gpu=2 actor_rollout_ref.rollout.tensor_model_parallel_size=2 actor_rollout_ref.rollout.name=vllm actor_rollout_ref.rollout.gpu_memory_utilization=0.8 actor_rollout_ref.ref.log_prob_max_token_len_per_gpu=32768 actor_rollout_ref.ref.log_prob_micro_batch_size_per_gpu=2 critic.optim.lr=1e-5 critic.ulysses_sequence_parallel_size=1 critic.model.use_remove_padding=True critic.optim.lr_warmup_steps_ratio=0.05 critic.model.path=/root/models/Qwen/Qwen2.5-0.5B critic.model.enable_gradient_checkpointing=False critic.use_dynamic_bsz=False critic.ppo_max_token_len_per_gpu=32768 critic.ppo_micro_batch_size_per_gpu=2 critic.model.fsdp_config.param_offload=False critic.model.fsdp_config.optimizer_offload=False reward_model.enable=True reward_model.ulysses_sequence_parallel_size=1 reward_model.model.path=/root/models/Qwen/Qwen2.5-0.5B reward_model.model.use_remove_padding=True reward_model.model.fsdp_config.param_offload=True 
reward_model.use_dynamic_bsz=False reward_model.forward_max_token_len_per_gpu=32768 reward_model.micro_batch_size_per_gpu=2 algorithm.use_kl_in_reward=False trainer.critic_warmup=0 'trainer.logger=[console]' trainer.project_name=verl-test trainer.experiment_name=qwen2.5-0.5b-model-reward-minimal trainer.nnodes=1 trainer.n_gpus_per_node=8 trainer.val_before_train=False trainer.test_freq=False trainer.save_freq=-1 trainer.resume_mode=disable trainer.total_epochs=2 trainer.total_training_steps=1
    Traceback (most recent call last):
      File "<frozen runpy>", line 189, in _run_module_as_main
      File "<frozen runpy>", line 112, in _get_module_details
      File "/sgl-workspace/verl/__init__.py", line 22, in <module>
        from .protocol import DataProto
      File "/sgl-workspace/verl/protocol.py", line 30, in <module>
        import tensordict
      File "/usr/local/lib/python3.12/dist-packages/tensordict/__init__.py", line 6, in <module>
        import tensordict._reductions
      File "/usr/local/lib/python3.12/dist-packages/tensordict/_reductions.py", line 11, in <module>
        from tensordict._lazy import LazyStackedTensorDict
      File "/usr/local/lib/python3.12/dist-packages/tensordict/_lazy.py", line 38, in <module>
        from tensordict.memmap import MemoryMappedTensor
      File "/usr/local/lib/python3.12/dist-packages/tensordict/memmap.py", line 25, in <module>
        from torch.multiprocessing.reductions import ForkingPickler
    ImportError: cannot import name 'ForkingPickler' from 'torch.multiprocessing.reductions' (/usr/local/lib/python3.12/dist-packages/torch/multiprocessing/reductions.py)
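The traceback indicates a torch/tensordict version mismatch: the installed tensordict build expects `torch.multiprocessing.reductions` to re-export `ForkingPickler` (which originates in the stdlib's `multiprocessing.reduction`), but the installed torch no longer does, so tensordict fails at import time. As a side note (not part of this PR), a minimal sketch of a check that reproduces this failure mode as a boolean:

```python
import importlib


def imports_cleanly(module_name: str) -> bool:
    """Return True if the module imports without raising.

    An ImportError here is the same failure mode as the traceback
    above: the module exists, but one of its own imports is missing.
    """
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:  # ModuleNotFoundError is a subclass
        return False


# Stdlib modules always import cleanly; an unknown name does not.
print(imports_cleanly("multiprocessing.reduction"))  # → True (ForkingPickler's original home)
print(imports_cleanly("no_such_module"))             # → False
```

Running `imports_cleanly("tensordict")` in the failing container would return False before this PR's dependency bump and True after it.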

Checklist Before Submitting

  • [x] Read the Contribute Guide.
  • [x] Apply pre-commit checks.
  • [ ] Add [BREAKING] to the PR title if it breaks any API.
  • [ ] Update the documentation about your changes in the docs.
  • [ ] New CI unit test(s) are added to cover the code path.
  • [x] Rely on existing unit tests on CI that cover the code path.

Signed-off-by: Vicky Tsang <vtsang@amd.com>
@yushengsu-thu yushengsu-thu self-requested a review June 7, 2025 00:07
@vermouth1992 vermouth1992 merged commit d02b3d5 into volcengine:main Jun 7, 2025
1 check passed
yellowbee686 pushed a commit to yellowbee686/verl that referenced this pull request Jun 10, 2025
whatadayG pushed a commit to whatadayG/verl that referenced this pull request Sep 5, 2025