Multi-turn rollout & agentic RL Status & Roadmap #131

@SwordFaith

Current Status Tracker

training setting support:

| feature | status | issue / pr |
| --- | --- | --- |
| fsdp | done | |
| fsdp2 | done (verified by Yuzhen Zhou @ SGLang & AMD) | volcengine/verl#1650 |
| megatron | done (developed by Xiang Long @ SGLang & ModelBest, Ziyuan Gao @ Bytedance) | volcengine/verl#1602 |
| fp8 | not yet supported | |

rollout feature support:

| feature | status | issue / pr |
| --- | --- | --- |
| request-level async rollout | done | |
| tool interaction | done | |
| VLM geo3k | in review (Nan Jiang @ Amazon & Congkai Xie @ RealLM Labs) | #137, volcengine/verl#2014 |
| multi-node | done (by Shengui Li & Jin Pan @ SGLang) | |
| tool rate limit | ETA: mid-May (volcengine team is working on it) | |
| exact colocated rollout | ETA: May (Junrong Lin @ SGLang & Qwen is working on it) | |
| server-based rollout | in development (Haiquan Chen @ Bytedance & Xibin Wu are working on it) | volcengine/verl#1698, volcengine/verl#1769, volcengine/verl#1831 |
| partial rollout | ETA: unknown (Yuzhen Zhou @ SGLang & AMD is working on it) | |

tool support:

| feature | status | issue / pr |
| --- | --- | --- |
| calc_gsm8k_reward | done | |
| sandboxfusion | testing (thanks to Xiaocheng Wang @ Bytedance) | volcengine/verl#1525 |
| openhands-like | pending | |
| android world-like | ETA: unknown (Congkai Xie @ RealLM Labs is working on it) | |
| browser-use-like | ETA: unknown (Bai is working on it) | |
| search | done (thanks to Ling Chang @ USTC & Baidu (author), Bowen Jin @ UIUC (advisor)) | volcengine/verl#1682 |

algorithm support:

| feature | status | issue / pr |
| --- | --- | --- |
| GRPO | done | |
| PPO | pending (Amazon AGI Lab is investigating) | |
| Reinforce++ | pending | |
| other (suggestions welcome in this thread) | TBD | |

Road Map

Refactor To-dos

Troubleshooting Tracker

| issue | status | reason | pr | owner |
| --- | --- | --- | --- | --- |
| sglang offloading does not work (volcengine/verl#1545) | pending verification | | | |
| recent reproduction failure (volcengine/verl#1037 (comment)) | pending verification | | | |
