Skip to content

Conversation

eric-haibin-lin
Copy link
Collaborator

@eric-haibin-lin eric-haibin-lin commented Jul 5, 2025

What does this PR do?

As part of the refactoring efforts, this PR consolidate the actor config in ppo-trainer and ppo-megatron-trainer. The shared configs are in actor/actor.yaml, and both megatron_actor.yaml and dp_actor.yaml inherits from it. If the change is reasonable we can expand it to critic, ref, rollout, data, etc.

Backward compatibility:

  • making ppo_trainer hierarchical means that we rely on hydra to perform config resolution.
  • using hydra with ppo_trainer.yaml or any config inheriting from ppo_trainer.yaml works fine as usually, however
  • directly loading ppo_trainer.yaml with OmegaConf now need to change to the following:
-        cfg = OmegaConf.load("verl/trainer/config/ppo_trainer.yaml")
+        import os
+
+        from hydra import compose, initialize_config_dir
+
+        with initialize_config_dir(config_dir=os.path.abspath("verl/trainer/config")):
+            cfg = compose(config_name="ppo_trainer")

Checklist Before Starting

  • Search for similar PRs. Paste at least one query link here: ...
  • Format the PR title as [{modules}] {type}: {description} (This will be checked by the CI)
    • {modules} include fsdp, megatron, sglang, vllm, rollout, trainer, ci, training_utils, recipe, hardware, deployment, ray, worker, single_controller, misc, perf, model, algo, env, tool, ckpt, doc, data
    • If this PR involves multiple modules, separate them with , like [megatron, fsdp, doc]
    • {type} is in feat, fix, refactor, chore, test
    • If this PR breaks any API (CLI arguments, config, function signature, etc.), add [BREAKING] to the beginning of the title.
    • Example: [BREAKING][fsdp, megatron] feat: dynamic batching

Test

For backward compatibility, a copy of the original legacy configs are stored in the test folder. A test is added to make sure the loaded new config is equivalent to legacy config. This test, however, should be disabled in the future since we will only maintain and update the new modular config going forward.

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

eric-haibin-lin and others added 10 commits July 5, 2025 13:25
- Add TestConfigComparison class with recursive config comparison method
- Test ppo_trainer.yaml matches legacy_ppo_trainer.yaml exactly
- Test ppo_megatron_trainer.yaml matches legacy_ppo_megatron_trainer.yaml exactly
- Use Hydra compose to properly resolve config defaults and interpolations
- Tests pass when configs match and fail with detailed error messages when they differ

Co-Authored-By: H <linhaibin.eric@gmail.com>
- Move TestConfigComparison from test_algo_config_on_cpu.py to test_legacy_config_on_cpu.py
- Create new test_legacy_config_on_cpu.py file with config comparison tests
- Remove TestConfigComparison class from test_algo_config_on_cpu.py
- Both test files continue to work correctly after the move
- Maintains all existing functionality while improving test organization

Co-Authored-By: H <linhaibin.eric@gmail.com>
- Add comprehensive field documentation following format check instructions
- Document all fields in strategy, data_loader_seed, load_weight sections
- Add detailed documentation for checkpoint configuration
- Document all optimizer fields with explanations and options
- Add comprehensive megatron section documentation covering parallelism options
- Document profiling configuration fields
- Follow CI format requirements: comments above fields, blank lines between fields
- Use information from legacy_ppo_megatron_trainer.yaml as reference
- Maintain proper indentation for nested fields

Co-Authored-By: H <linhaibin.eric@gmail.com>
- Attempt to add comprehensive comments above each field
- Working to resolve CI format validation failures
- YAML file loads correctly but format validation still needs work
- Continuing to debug comment placement requirements

Co-Authored-By: H <linhaibin.eric@gmail.com>
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This PR consolidates the actor config, enhancing maintainability. A test was added for backward compatibility. A logic error was identified in the new test file that needs to be addressed.

@eric-haibin-lin eric-haibin-lin changed the title [config] refactor: make actor config more modular [cfg] refactor: make actor config more modular Jul 5, 2025
@vermouth1992
Copy link
Collaborator

Shall we modify other roles as well in this PR?

@eric-haibin-lin
Copy link
Collaborator Author

Shall we modify other roles as well in this PR?

There's already 2k lines of code and I'd prefer we break them up and change one by one incrementally

@eric-haibin-lin eric-haibin-lin enabled auto-merge (squash) July 6, 2025 14:34
@tongyx361 tongyx361 self-assigned this Jul 6, 2025
Copy link
Collaborator

@wuxibin89 wuxibin89 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After all refactors are done, can ppo_trainer.yaml and ppo_megatron_trainer.yaml be merged into one file?

@eric-haibin-lin
Copy link
Collaborator Author

After all refactors are done, can ppo_trainer.yaml and ppo_megatron_trainer.yaml be merged into one file?

yeah i think that's possible

@eric-haibin-lin eric-haibin-lin merged commit 3f929af into volcengine:main Jul 7, 2025
46 of 47 checks passed
lkc233 pushed a commit to lkc233/verl that referenced this pull request Jul 10, 2025
ArronHZG pushed a commit to imh966/verl that referenced this pull request Jul 10, 2025
oseyosey pushed a commit to oseyosey/verl that referenced this pull request Jul 28, 2025
Juniper1021 pushed a commit to Juniper1021/verl that referenced this pull request Aug 7, 2025
whatadayG pushed a commit to whatadayG/verl that referenced this pull request Sep 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants