[roadmap] verl Q3 development #2388

@eric-haibin-lin

Past roadmap discussions for reference: #710 #22

The top priority for verl in Q3 is to make it a modular, foundational library that the community can extend: a starting point, not the destination.

composable model engines

Finish up #1560 so that the parallelism strategy is implemented at the engine level, without exposing its details to the worker (role) level. The fsdp/megatron engines are expected to be created and run standalone, and to be reused across different roles.

  • fsdp actor, critic, ref (focus on fsdp2)
  • megatron actor, critic, ref
  • torchtitan integration (call for contribution)
  • switch all recipes/examples from fsdp1 to fsdp2 by default (and remove poorly maintained ones)

A work-in-progress interface is open for comments in #1977.
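To make the engine/role split concrete, here is a minimal sketch, assuming a hypothetical `ModelEngine` interface. It is not the interface proposed in #1977 or #1560. The names `ModelEngine`, `ToyShardedEngine`, and `RoleWorker` are invented for illustration, and the "sharding" is a plain-Python stand-in for fsdp/megatron. The point is only that the role code never sees how the batch is split.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Sequence, Union


class ModelEngine(ABC):
    """Hypothetical engine interface: the engine owns the parallelism strategy."""

    @abstractmethod
    def forward_backward(self, batch: Sequence[float]) -> float:
        """Run one step on a batch and return a scalar loss."""

    @abstractmethod
    def describe_parallelism(self) -> str:
        """Human-readable description, for logging only."""


@dataclass
class ToyShardedEngine(ModelEngine):
    """Stand-in for an fsdp/megatron engine (assumption: no real sharding here)."""

    num_shards: int = 2

    def forward_backward(self, batch: Sequence[float]) -> float:
        # Split the batch into strided slices, one per fake shard,
        # then combine the partial sums into a mean "loss".
        shards = [batch[i::self.num_shards] for i in range(self.num_shards)]
        total = sum(sum(shard) for shard in shards)
        return total / len(batch)

    def describe_parallelism(self) -> str:
        return f"toy-sharded x{self.num_shards}"


@dataclass
class RoleWorker:
    """A role (actor, critic, ref) that depends only on the engine interface."""

    role: str
    engine: ModelEngine

    def step(self, batch: Sequence[float]) -> Dict[str, Union[str, float]]:
        return {"role": self.role, "loss": self.engine.forward_backward(batch)}


# The same engine class is reused by different roles with different settings.
actor = RoleWorker("actor", ToyShardedEngine(num_shards=4))
critic = RoleWorker("critic", ToyShardedEngine(num_shards=2))
```

Changing `num_shards`, or swapping in another `ModelEngine` subclass, requires no change to `RoleWorker`, which is the property the roadmap item asks for.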

rollout workers

  • optimize server mode rollout performance
  • modular rollout workers: VllmRolloutWorker and SGLangRolloutWorker, exposing the same APIs
  • support models with randomly initialized weights
  • weight resharding: optimize tp x dp dispatch, and support receiving weights from separate resource groups
  • Agent RL infrastructure: see "[agentic RL] multi-turn rollout and agent loop development tracking" (#2618)
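As an illustration of "exposing the same APIs", the sketch below gives the two worker names from the list a shared abstract base. The method names `update_weights` and `generate`, the base class `RolloutWorker`, and the helper `run_rollout` are assumptions made for this example, not verl's actual interface, and the bodies are placeholders that do not call vLLM or SGLang.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class RolloutWorker(ABC):
    """Hypothetical common API for rollout backends."""

    @abstractmethod
    def update_weights(self, state: Dict[str, float]) -> None:
        """Receive new weights from the trainer."""

    @abstractmethod
    def generate(self, prompts: List[str]) -> List[str]:
        """Produce one completion per prompt."""


class _StubRolloutWorker(RolloutWorker):
    """Shared placeholder logic; a real worker would wrap an inference engine."""

    backend = "stub"

    def __init__(self) -> None:
        self.weights_version = 0

    def update_weights(self, state: Dict[str, float]) -> None:
        # Placeholder: only count how many times weights were synced.
        self.weights_version += 1

    def generate(self, prompts: List[str]) -> List[str]:
        tag = f"[{self.backend} v{self.weights_version}]"
        return [f"{tag} {prompt}" for prompt in prompts]


class VllmRolloutWorker(_StubRolloutWorker):
    backend = "vllm"  # placeholder label, no vLLM call is made


class SGLangRolloutWorker(_StubRolloutWorker):
    backend = "sglang"  # placeholder label, no SGLang call is made


def run_rollout(worker: RolloutWorker, prompts: List[str],
                state: Dict[str, float]) -> List[str]:
    """Trainer-side code that works with any RolloutWorker."""
    worker.update_weights(state)
    return worker.generate(prompts)
```

Because `run_rollout` is written against `RolloutWorker` only, switching backends becomes a one-line change at construction time.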

Additional ongoing efforts:

async & disaggregated architecture

multi-turn, data, config infra

streamline new model workflow

  • document the workflow for adding a new hf model to verl. With the latest vllm, there is no longer any need to add the weight loader mentioned in https://verl.readthedocs.io/en/latest/advance/fsdp_extension.html
  • better abstraction and registration system for multi-modal models. Currently, different multi-modal models have inconsistent config attributes (e.g. rope), freeze/unfreeze setups, and input/output processing. Ideally this would be handled at the huggingface transformers level, but that is not sufficient right now (cc @NielsRogge). An RFC is needed.
  • verl needs a documentation page about the latest status of model support and per model related features (lora, sequence parallelism, megatron, etc)
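One possible shape for the multi-modal registration idea is a small registry that records, per model type, where the rope settings live and which parameters are frozen by default. This is a sketch under assumptions: `MultiModalSpec`, `register_multimodal`, `get_spec`, `trainable_params`, the model types `toy_vlm_a`/`toy_vlm_b`, and the attribute paths are all hypothetical and do not describe any real model or verl API.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MultiModalSpec:
    """Hypothetical per-model record of the settings that currently diverge."""

    model_type: str
    rope_attr: str                    # config path holding the rope settings
    frozen_prefixes: Tuple[str, ...]  # parameter-name prefixes frozen by default


_REGISTRY: Dict[str, MultiModalSpec] = {}


def register_multimodal(spec: MultiModalSpec) -> MultiModalSpec:
    """Add a spec to the registry, rejecting duplicate model types."""
    if spec.model_type in _REGISTRY:
        raise ValueError(f"model type already registered: {spec.model_type}")
    _REGISTRY[spec.model_type] = spec
    return spec


def get_spec(model_type: str) -> MultiModalSpec:
    """Look up a spec, with a clear error for unsupported models."""
    if model_type not in _REGISTRY:
        raise KeyError(f"no multi-modal spec registered for {model_type!r}")
    return _REGISTRY[model_type]


def trainable_params(spec: MultiModalSpec, param_names: List[str]) -> List[str]:
    """Return the parameter names that are not frozen under this spec."""
    return [n for n in param_names if not n.startswith(spec.frozen_prefixes)]


# Illustrative entries only; the attribute paths and prefixes are made up.
register_multimodal(MultiModalSpec("toy_vlm_a", "rope_scaling", ("vision_tower.",)))
register_multimodal(MultiModalSpec("toy_vlm_b", "text_config.rope_scaling", ()))
```

A registry like this could also feed the model-support status page, since every registered spec is a machine-readable row.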

high quality recipes and end2end optimizations

  • retool recipe (code is ready, going through reviews)
  • SOTA multimodal vlm RL recipe (call for contribution)
  • enhance DAPO recipe with larger models, and provide scripts with high training throughput (many perf knobs are not turned on in the current script)
  • we welcome more recipes from the community. If you're interested in contributing, please open an RFC before opening any PR for a recipe; see "[RFC] Verl recipe for image generation model" (#2136)

Additional existing ongoing features:

Many roadmap tasks in this doc were initiated by, and credit goes to, @vermouth1992 and @SwordFaith.
