Disambiguate normalized ppo_mini_batch_size by renaming to ppo_mini_batch_size_normalized

Given the high-level documentation, ppo_mini_batch_size is defined as a global algorithmic parameter. Inside the trainer it gets normalized:

```python
self.config.actor.ppo_mini_batch_size //= (self.device_mesh.size() // self.ulysses_sequence_parallel_size)
```

This normalization is functionally correct, but it overwrites the original config value with a new value value. In my opinion, while first reading Verl code, it creates semantic ambiguity — especially when ppo_mini_batch_size is used in formulas like:

```python
gradient_accumulation = ppo_mini_batch_size // micro_batch_size_per_gpu
```

^ One was named explicitly (micro_batch_size_per_gpu), while the other (ppo_mini_batch_size) is in-place modified from global to local. 

To make it more clear to contributors, would it be better if we rename it? This won't make any breaking changes to user side, just to make the codebase with more clarity.
```
self.config.actor.ppo_mini_batch_size_normalized = self.config.actor.ppo_mini_batch_size // ... normalize
```

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Disambiguate normalized ppo_mini_batch_size by renaming to ppo_mini_batch_size_normalized #1043

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Disambiguate normalized ppo_mini_batch_size by renaming to ppo_mini_batch_size_normalized #1043

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions