🚀 [2025/07/31] Recent Updates Summary for ROLL Project #112

@PanAndy

Hello everyone!
Thank you for your interest in ROLL.

ROLL has recently introduced a host of new features. Below is a summary of the recent updates. We will continue to iterate and update ROLL, and we welcome you to join the ROLL community.

🚀 New Features

  • Agentic RL

    • Refactored and optimized the Agentic RL design to provide a more powerful and flexible framework.
    • Introduced multi-turn interactive local development and debugging capabilities for Agentic RL, significantly boosting development and debugging efficiency. Example: tests/agentic/env_manager/test_traj_env_manager.py
    • Added Agentic RL async training to improve training efficiency. Example: examples/qwen2.5-0.5B-agentic/agent_val_frozen_lake_async.yaml
  • New Training Capabilities

    • Added support for Group Sequence Policy Optimization (GSPO), selected via importance_sampling: Literal["token", "seq"].
    • Introduced Distill Pipeline, providing knowledge distillation capabilities. Path: roll/pipeline/distill/distill_pipeline.py
    • Added VLM Multi-domain RLVR Pipeline, enabling multi-domain joint training for multi-modal models. Path: roll/pipeline/rlvr/rlvr_vlm_pipeline.py
    • Added a DPO Pipeline. Path: roll/pipeline/dpo/dpo_pipeline.py
    • Supported LoRA training. Example: examples/qwen2.5-7B-rlvr_megatron/rlvr_lora_zero3.yaml
    • Added the latest math test datasets, GPQA-Diamond, and a new MultipleChoiceBoxedRuleRewardWorker.
  • Other Enhancements

    • Improved checkpoint restoration from downloaded model paths.
    • Added CLAUDE.md documentation.
    • Fixed issues caused by Automap concurrency.
    • Fixed directory issues when saving Critic checkpoints.
    • Resolved vllm_strategy Qwen3 Dense FP8 compatibility issues.
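
To illustrate the importance_sampling option listed under New Training Capabilities, here is a minimal sketch of how a token-level ratio differs from a sequence-level one. It assumes that "seq" corresponds to the length-normalized sequence ratio used by GSPO, and the function name importance_ratios and its arguments are hypothetical; this is not ROLL's actual implementation or API.

```python
import math
from typing import List, Literal


def importance_ratios(
    new_logprobs: List[float],
    old_logprobs: List[float],
    importance_sampling: Literal["token", "seq"] = "token",
) -> List[float]:
    """Return one importance ratio per response token.

    Hypothetical helper for illustration only (not part of ROLL).
    new_logprobs / old_logprobs hold per-token log-probabilities of the
    same response under the current and the old policy.
    """
    if not new_logprobs or len(new_logprobs) != len(old_logprobs):
        raise ValueError("log-prob lists must be non-empty and equal in length")

    # Per-token log ratio: log pi_new(y_t) - log pi_old(y_t)
    log_ratios = [n - o for n, o in zip(new_logprobs, old_logprobs)]

    if importance_sampling == "token":
        # Each token gets its own ratio.
        return [math.exp(d) for d in log_ratios]

    if importance_sampling == "seq":
        # Assumed GSPO-style ratio: exponentiate the mean log ratio
        # (length-normalized), then share it across all tokens.
        seq_ratio = math.exp(sum(log_ratios) / len(log_ratios))
        return [seq_ratio] * len(log_ratios)

    raise ValueError(f"unknown importance_sampling: {importance_sampling!r}")


if __name__ == "__main__":
    new = [-1.0, -2.0]
    old = [-1.5, -1.5]
    print(importance_ratios(new, old, "token"))
    print(importance_ratios(new, old, "seq"))
```

In this sketch the per-token deviations cancel out under "seq", so every token shares a single ratio, whereas "token" keeps each token's own ratio. In a real training loop these ratios would then be clipped and multiplied by advantages.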
