veRL Megatron-core Development Tracking
This issue tracks the development of verl + mcore (Megatron-core).
The milestone target is to enable training DeepSeek-V3 on veRL (#708); the longer-term goal is to continuously improve the training experience with the mcore backend.
Progress and TODO
Recent
- update mcore version to 0.11 (#392: Update megatron-lm to core_r0.11.0)
- use the mcore `GPTModel` API instead of the huggingface workaround, with sequence packing (#706: Use Mcore GPTModel); see the sketch after this list
- support context parallel (#970: [Mcore] context parallel)
- support loading mcore dist_checkpointing (#1030: [mcore] option to use dist checkpoint)
- support Megatron 0.11.0 and vLLM 0.8.2 (#851: Support Megatron 0.11.0 and vLLM 0.8.2, update images to use latest vllm and Megatron)
- support qwen2moe training (#1139: [mcore] qwen2moe support)
- support Moonlight-16B-A3B training (WIP) (#1284: [mcore] moonlight (small model with deepseekv3 arch))
- support Qwen2.5-VL training (#1286: [megatron] feat: qwen2.5vl)
- support EP (expert parallel) (#1467: [megatron] support megatron expert parallel)
Further
- FP8 training (see the sketch after this list)
- training-efficiency optimizations
- support the SGLang inference engine
- support the TensorRT-LLM (trtllm) inference engine
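FP8 training is not yet wired into verl per this roadmap; the sketch below only illustrates how it is typically switched on through Megatron-core's `TransformerConfig` fields (requires TransformerEngine and FP8-capable GPUs). The model shape values are placeholders, and field defaults may vary across mcore versions.

```python
# Hedged sketch: FP8-related TransformerConfig fields in Megatron-core ~0.11.
from megatron.core.transformer.transformer_config import TransformerConfig

fp8_config = TransformerConfig(
    num_layers=24,                  # placeholder model shape
    hidden_size=2048,
    num_attention_heads=16,
    fp8="hybrid",                   # e4m3 in forward, e5m2 for gradients
    fp8_margin=0,
    fp8_amax_history_len=1024,      # history window for amax tracking
    fp8_amax_compute_algo="max",
)
```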