Replies: 4 comments
-
Currently verl adopts DeepSpeed Ulysses for long-context training. Ulysses should be natively compatible with Ascend NPU since it relies on all-to-all communication.
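To make the all-to-all point concrete, here is a single-process sketch (plain Python, not verl or DeepSpeed code; `P`, `SEQ`, `HEADS`, and the helper name are illustrative) of the layout swap that the Ulysses all-to-all performs: before attention each rank holds a sequence shard for all heads, and after the collective each rank holds the full sequence for a head slice.

```python
# Single-process sketch of the layout swap DeepSpeed Ulysses performs
# via all-to-all (illustrative only; not verl or DeepSpeed code).
P = 2            # sequence-parallel world size
SEQ, HEADS = 4, 4

# Before attention: rank r holds sequence shard r for ALL attention heads.
# Each entry is a (token_position, head) pair standing in for a tensor slice.
before = [[[(r * (SEQ // P) + t, h) for h in range(HEADS)]
           for t in range(SEQ // P)]
          for r in range(P)]

def ulysses_all_to_all(shards):
    """Exchange head chunks between ranks: afterwards rank r holds the
    FULL sequence but only heads [r*HEADS//P, (r+1)*HEADS//P)."""
    chunk = HEADS // P
    out = []
    for r in range(P):                      # receiving rank
        rows = []
        for s in range(P):                  # sending rank (sequence shard s)
            for t in range(SEQ // P):
                rows.append(shards[s][t][r * chunk:(r + 1) * chunk])
        out.append(rows)
    return out

after = ulysses_all_to_all(before)
print(after[0][0])   # rank 0, seq position 0, heads 0-1 -> [(0, 0), (0, 1)]
```

Because the collective only moves tensor chunks between ranks, it does not depend on any kernel specific to one accelerator, which is why Ulysses is expected to port cleanly to hardware with a working all-to-all.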
-
Hi hiyouga, thanks for your reminder.😊 DeepSpeed Ulysses support in verl is included in the GRPO algorithm for Qwen-series models. Next we will verify it and publish the results, so stay tuned.
-
Q3 roadmap: #2171
-
The environment has been configured successfully, but the run got stuck at `WARNING:2025-07-11 11:28:53,278:Waiting for register center actor aTAAB0_register_center to be ready. Elapsed time: 0 seconds out of 300 seconds.` May I ask what the possible reason might be?
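A common cause of this kind of timeout is Ray networking: the worker nodes cannot reach the head node's address/port, so actors never register. As a first check, the reachability test below uses only the Python standard library; the host and port are placeholders you must replace with your cluster's actual Ray head address (this is a generic diagnostic, not a verl-specific fix).

```python
import socket

def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Replace with the address/port you passed to `ray start --head` —
# the values below are placeholders for illustration.
print(port_reachable("127.0.0.1", 6379))
```

If this returns `False` from a worker node, check firewalls and that the head node is bound to an interface the workers can route to, before digging into verl itself.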
-
Contents
Native support for verl on Ascend NPU has attracted the attention of some developers. This roadmap tracks the progress of native support; everyone is welcome to join the discussion.
Quick Start
document: ascend_quick_start.rst
Plan
- Dependencies (Q1, done)
  - transformers
  - ray
  - FSDP worker
  - vLLM-ascend v0.7.3
  - (Some features have been temporarily circumvented and are marked in the Q2 plan)
- Q2 Plan
  - `--use_remove_padding`
  - megatron/mindspeed worker (for NPU, megatron ≈ mindspeed)
  - Release accuracy comparison results: modify the default config as little as possible to keep the accuracy.
  - Ease of use: `flash-attn` is not supported on Ascend NPU, so we need to replace it with `torch_npu.npu_fusion_attention`. [Temporary solution] NPU support SDPA: huggingface/transformers#35165
- Long-term Planning
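For context on the `--use_remove_padding` item, here is a minimal sketch of the idea in plain Python (not verl's actual implementation): pad tokens are dropped and cumulative sequence lengths (`cu_seqlens`) are recorded, which is the flattened layout that variable-length attention kernels consume.

```python
# Conceptual sketch of remove-padding / sequence packing:
# drop pad tokens and record cumulative sequence lengths (cu_seqlens).
# Pure-Python illustration, not verl code.
def remove_padding(batch, attention_mask):
    """batch: list of token-id lists; attention_mask: parallel 0/1 lists.
    Returns (flat_tokens, cu_seqlens)."""
    flat, cu_seqlens = [], [0]
    for tokens, mask in zip(batch, attention_mask):
        kept = [t for t, m in zip(tokens, mask) if m]
        flat.extend(kept)
        cu_seqlens.append(cu_seqlens[-1] + len(kept))
    return flat, cu_seqlens

batch = [[5, 7, 0, 0], [3, 4, 6, 0]]   # two sequences padded to length 4
mask  = [[1, 1, 0, 0], [1, 1, 1, 0]]
flat, cu = remove_padding(batch, mask)
print(flat, cu)   # [5, 7, 3, 4, 6] [0, 2, 5]
```

Skipping the pad positions entirely is what makes this a throughput win for batches with uneven sequence lengths, which is why the flag matters for the Q2 work above.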