Skip to content

Disambiguate normalized ppo_mini_batch_size by renaming to ppo_mini_batch_size_normalized #1043

@Superskyyy

Description

@Superskyyy

Given the high-level documentation, ppo_mini_batch_size is defined as a global algorithmic parameter. Inside the trainer it gets normalized:

self.config.actor.ppo_mini_batch_size //= (self.device_mesh.size() // self.ulysses_sequence_parallel_size)

This normalization is functionally correct, but it overwrites the original config value with a new value value. In my opinion, while first reading Verl code, it creates semantic ambiguity — especially when ppo_mini_batch_size is used in formulas like:

gradient_accumulation = ppo_mini_batch_size // micro_batch_size_per_gpu

^ One was named explicitly (micro_batch_size_per_gpu), while the other (ppo_mini_batch_size) is in-place modified from global to local.

To make it more clear to contributors, would it be better if we rename it? This won't make any breaking changes to user side, just to make the codebase with more clarity.

self.config.actor.ppo_mini_batch_size_normalized = self.config.actor.ppo_mini_batch_size // ... normalize

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions