
Conversation

LeonMalteW
Contributor

What does this PR do?

Add the missing config entry.

Issues

List issues that this PR closes:
I found no existing issue for this, but when I tried to run GRPO on DeepScaler with the standard config, it could not run.

  • [yes] Make sure you read and followed Contributor guidelines
  • [no] Did you write any new necessary tests?
  • [no] Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • [no] Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

@terrykong
Contributor

Thanks for fixing! Could you please sign off your commit?

https://github.com/NVIDIA/NeMo-RL/blob/main/CONTRIBUTING.md#signing-your-work

To sign off retroactively, you can rebase and follow up with a force push:

git rebase HEAD~1 --signoff
git push --force-with-lease origin main  # note this is your fork's main since that's the source branch
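
Alternatively, if it's just the one commit, amending it with a signoff should work as well; git fills in the Signed-off-by trailer from your configured user.name and user.email (the name and address below are placeholders):

git commit --amend --signoff --no-edit   # appends "Signed-off-by: Your Name <you@example.com>"
git push --force-with-lease origin main  # again, your fork's main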

@SahilJain314
Contributor

@terrykong I thought we were originally deriving this config from the common grpo config; I guess that changed at some point. Should we update this one to enable dynamic batching like the common one?

@terrykong
Contributor

@SahilJain314 good point. We should go back to depending on the common config.

I think for this case @abukharin-nv didn't use dynamic batching in his recipes, so this PR is faithful to the original experiment. A follow-up PR can enable dynamic batching just to make sure convergence is still good.
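
For the follow-up, I'd expect the change to look roughly like the dynamic batching block in the common grpo config, something along these lines (the key names and values here are from memory, so double-check them against the common config rather than copying them verbatim):

policy:
  dynamic_batching:
    enabled: true              # assumed key layout, mirroring the common grpo config
    train_mb_tokens: 16384     # illustrative per-microbatch token budget
    logprob_mb_tokens: 32768   # illustrative value
    sequence_length_round: 64  # round sequence lengths for more even packing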

@terrykong
Contributor

terrykong commented May 28, 2025

Hi @LeonMalteW. If you're not able to resolve the DCO by this afternoon, I will create a new PR (and give you credit for the contribution) and merge it, since this bug should be fixed ASAP.

@LeonMalteW
Contributor Author

By the way, I'm still trying to reproduce the grpo-deepscaler run.

At first, I thought there was only one thing wrong with the configuration, but after the fix I could only run the first stage successfully.

When I ran the second stage with max_total_sequence_length set to 16384, it led to lots of VRAM problems.

The only run I could perform was by reducing the GRPO batch size to 2 and adjusting the other configurations accordingly.

This obviously leads to an increase in training time of 20x or more, which is why I'm asking whether there is a functioning configuration out there.

Maybe fixing the config was not the right approach, and something else is wrong.

This is my "working config":

# GRPO Algorithm Configuration
defaults: "grpo-deepscaler-1.5b-8K.yaml"

grpo:
  num_prompts_per_step: 2 # original 128

loss_fn:
  reference_policy_kl_penalty: 0.001
  ratio_clip_max: 0.28


policy:
  max_total_sequence_length: 16384

  train_global_batch_size: 16 # original 64
  generation_batch_size: 16 # original 32
  logprob_batch_size: 1 # original 4
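
For context, I'm launching it roughly like this (the entry-point script and --config flag are from memory, so treat them as an assumption and adjust to whatever the repo's GRPO example actually uses):

# hypothetical launch command; replace the config path with wherever you saved the override file above
uv run python examples/run_grpo_math.py --config path/to/grpo-deepscaler-16K-reduced.yaml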

@terrykong
Contributor

@LeonMalteW do you mind opening a new issue for your OOM and sharing your hardware details? It'll help us triage.

I'm going to close this PR in favor of #455 (I've given you credit for the contribution).
