- Can we support the Pangu model in the future?
1 reply
- Hi, in addition to the features mentioned above, we are looking forward to seeing the following supported in verl on Ascend devices:
  - Inference backend
  - Training backend & specific features
  - Models & algorithms
  - Others
4 replies
- I used vllm-ascend to run GRPO with multi-turn function calling, but I found that the tool is never called. Is NPU + multi-turn function calling not supported? Thanks!

  ```bash
  # run on 4xH100
  # make sure your current working directory is the root of the project
  set -x
  export HYDRA_FULL_ERROR=1
  export VLLM_USE_V1=1
  ulimit -n 65535

  PROJECT_DIR="$(pwd)"
  CONFIG_PATH="$PROJECT_DIR/examples/sglang_multiturn/config"

  python3 -u -m verl.trainer.main_ppo \
      --config-path="$CONFIG_PATH" \
      --config-name='gsm8k_multiturn_grpo' \
      algorithm.adv_estimator=grpo \
      data.train_batch_size=128 \
      data.max_prompt_length=1024 \
      data.max_response_length=1024 \
      data.filter_overlong_prompts=True \
      data.truncation='error' \
      data.return_raw_chat=True \
      actor_rollout_ref.model.path=/vllm-workspace/Qwen2.5-3B-Instruct \
      actor_rollout_ref.model.use_remove_padding=True \
      actor_rollout_ref.actor.use_torch_compile=False \
      actor_rollout_ref.actor.optim.lr=1e-6 \
      actor_rollout_ref.actor.ppo_mini_batch_size=64 \
      actor_rollout_ref.actor.ppo_micro_batch_size_per_gpu=16 \
      actor_rollout_ref.actor.use_kl_loss=True \
      actor_rollout_ref.actor.kl_loss_coef=0.001 \
      actor_rollout_ref.actor.kl_loss_type=low_var_kl \
      actor_rollout_ref.actor.entropy_coeff=0 \
      actor_rollout_ref.model.enable_gradient_checkpointing=True \
      actor_rollout_ref.actor.fsdp_config.param_offload=False \
      actor_rollout_ref.actor.fsdp_config.optimizer_offload=False \
      actor_rollout_ref.rollout.log_prob_micro_batch_size_per_gpu=16 \
      actor_rollout_ref.rollout.tensor_model_parallel_size=4 \
      actor_rollout_ref.rollout.name=vllm \
      actor_rollout_ref.rollout.mode=async \
      actor_rollout_ref.rollout.gpu_memory_utilization=0.5 \
      actor_rollout_ref.rollout.n=8 \
      actor_rollout_ref.ref.log_prob_micro_batch_size_per_gpu=16 \
      actor_rollout_ref.ref.fsdp_config.param_offload=True \
      algorithm.use_kl_in_reward=False \
      trainer.val_before_train=True \
      trainer.critic_warmup=0 \
      trainer.logger='["console", "swanlab"]' \
      trainer.project_name='gsm8k_async_rl_0904' \
      trainer.experiment_name='qwen2.5-3b_function_rm-gsm8k-async-vllm-ascend-multi-w-tool-verify-n16-4cards_0904' \
      trainer.n_gpus_per_node=4 \
      trainer.nnodes=1 \
      trainer.save_freq=10 \
      trainer.test_freq=10 \
      trainer.total_training_steps=100 \
      actor_rollout_ref.actor.ppo_max_token_len_per_gpu=8192 \
      actor_rollout_ref.rollout.log_prob_max_token_len_per_gpu=8192 \
      actor_rollout_ref.ref.log_prob_max_token_len_per_gpu=8192 \
      critic.ppo_max_token_len_per_gpu=8192 \
      critic.forward_max_token_len_per_gpu=8192 \
      data.train_files=$HOME/data/gsm8k/train.parquet \
      data.val_files=$HOME/data/gsm8k/test.parquet \
      actor_rollout_ref.rollout.multi_turn.enable=True \
      actor_rollout_ref.rollout.multi_turn.format=hermes \
      actor_rollout_ref.rollout.multi_turn.tool_config_path="$PROJECT_DIR/examples/sglang_multiturn/config/tool_config/gsm8k_tool_config.yaml" \
      actor_rollout_ref.rollout.multi_turn.max_user_turns=1 \
      trainer.device=npu "$@"
  ```
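When debugging "the tool is never called", a first step is to check whether the model's rollouts contain any tool-call markup at all. In the hermes format, a call is typically wrapped in `<tool_call>...</tool_call>` tags containing a JSON object with `name` and `arguments` fields. The helper below is a minimal sketch, not part of verl; the tool name `calc_gsm8k_reward` in the example is only illustrative.

```python
import json
import re

# Matches hermes-style tool calls: a JSON object wrapped in <tool_call> tags.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def extract_tool_calls(text: str) -> list[dict]:
    """Return the parsed tool-call dicts found in a generated response."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            calls.append(json.loads(match.group(1)))
        except json.JSONDecodeError:
            pass  # the model emitted malformed JSON for this call
    return calls

# Example: scan one rollout string (tool name here is hypothetical).
response = (
    "Let me compute that.\n"
    "<tool_call>\n"
    '{"name": "calc_gsm8k_reward", "arguments": {"answer": "72"}}\n'
    "</tool_call>"
)
print(extract_tool_calls(response))
# -> [{'name': 'calc_gsm8k_reward', 'arguments': {'answer': '72'}}]
```

If this extractor finds nothing in the saved rollouts, the model is not attempting tool use at all (a prompting/template issue); if it finds calls that are never executed, the problem is more likely in the rollout backend's tool dispatch on NPU.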
1 reply
- Unfinished tasks from Q2:
  - megatron/mindspeed worker (for NPU, megatron ≈ mindspeed); Q2 roadmap: #900

  New features in Q3:
  - FSDP2 worker