This normalization is functionally correct, but it overwrites the original config value with a new value. In my opinion, from first reading the Verl code, it creates semantic ambiguity, especially when ppo_mini_batch_size is used in formulas like:
^ One is named explicitly (micro_batch_size_per_gpu), while the other (ppo_mini_batch_size) is modified in place from a global value to a per-GPU (local) value.
To make this clearer to contributors, would it be better to rename it? This would not introduce any breaking changes on the user side; it would only make the codebase clearer.
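
To illustrate the idea, here is a minimal sketch, not the actual Verl code. `ActorConfig` and `per_gpu_mini_batch_size` are hypothetical names, and the sketch assumes the normalization is a plain division by the number of GPUs; the real normalization may involve other factors. The point is only that the user-facing global value stays untouched and the derived local value gets its own explicit name.

```python
from dataclasses import dataclass


@dataclass
class ActorConfig:
    # Hypothetical stand-in for the actor config; only the two fields
    # discussed in this issue are modeled.
    ppo_mini_batch_size: int       # global value, exactly as the user wrote it
    micro_batch_size_per_gpu: int  # already per-GPU by name


def per_gpu_mini_batch_size(cfg: ActorConfig, world_size: int) -> int:
    """Derive the per-GPU mini-batch size without mutating cfg.

    Assumption: the global value is split evenly across world_size GPUs.
    """
    if world_size <= 0:
        raise ValueError("world_size must be positive")
    if cfg.ppo_mini_batch_size % world_size != 0:
        raise ValueError("ppo_mini_batch_size must be divisible by world_size")
    return cfg.ppo_mini_batch_size // world_size


cfg = ActorConfig(ppo_mini_batch_size=256, micro_batch_size_per_gpu=8)

# The derived value has an explicit "per GPU" name, mirroring
# micro_batch_size_per_gpu, so both operands are visibly local.
ppo_mini_batch_size_per_gpu = per_gpu_mini_batch_size(cfg, world_size=8)

# Illustrative use only: both sides of the division are now per-GPU.
micro_batches_per_mini_batch = (
    ppo_mini_batch_size_per_gpu // cfg.micro_batch_size_per_gpu
)
```

With this shape, `cfg.ppo_mini_batch_size` keeps its global meaning everywhere, and any formula that needs the local value has to ask for it by a name that says so.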