fix: Fix crash for logprob error plot #623

yfw · 2025-07-07T23:48:23Z

What does this PR do ?

Fixes below crash. I believe this is happening when the prompt length exceeds max sequence length

Traceback (most recent call last):
  File "/lustre/fs1/portfolios/coreai/users/yifuw/code/tmp/real_ep_merge/RL/examples/run_grpo_math.py", line 335, in <module>
    main()
  File "/lustre/fs1/portfolios/coreai/users/yifuw/code/tmp/real_ep_merge/RL/examples/run_grpo_math.py", line 318, in main
    grpo_train(
  File "/lustre/fs1/portfolios/coreai/users/yifuw/code/tmp/real_ep_merge/RL/nemo_rl/algorithms/grpo.py", line 777, in grpo_train
    logger.log_plot_token_mult_prob_error(
  File "/lustre/fs1/portfolios/coreai/users/yifuw/code/tmp/real_ep_merge/RL/nemo_rl/utils/logger.py", line 770, in log_plot_token_mult_prob_error
    max_abs_error_idx = torch.argmax(diff_i).item()
                        ^^^^^^^^^^^^^^^^^^^^
IndexError: argmax(): Expected reduction dim to be specified for input.numel() == 0.

Issues

List issues that this PR closes (syntax):

Usage

You can potentially add a usage example below

# Add a code snippet demonstrating how to use this

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

Reproduced crash by hardcoding logprob error threshold to 1.0 and setting policy.max_total_sequence_length=64: https://wandb.ai/nvidia/grpo-dev-yifu/runs/tcv42xq1/logs

With fix: https://wandb.ai/nvidia/grpo-dev-yifu/runs/i1krxjes/logs

Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>

nemo_rl/utils/logger.py

ZhiyuLi-Nvidia · 2025-07-08T00:37:24Z

LGTM now to unblock your experiments. Plan to look back to why generation_start_idx >= generation_end_idx later.

Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com> Signed-off-by: Jialei Chen <jialeic@google.com>

Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>

Fix crash for logprob error plot

9183e45

Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>

yfw requested a review from ZhiyuLi-Nvidia July 7, 2025 23:48

yfw changed the title ~~Fix crash for logprob error plot~~ fix: Fix crash for logprob error plot Jul 7, 2025

yfw requested a review from ashors1 July 7, 2025 23:53

terrykong reviewed Jul 8, 2025

View reviewed changes

nemo_rl/utils/logger.py Show resolved Hide resolved

terrykong enabled auto-merge July 8, 2025 18:01

terrykong approved these changes Jul 8, 2025

View reviewed changes

terrykong added this pull request to the merge queue Jul 8, 2025

Merged via the queue into main with commit 2fdb9ea Jul 8, 2025
15 of 18 checks passed

terrykong deleted the yifu/fix_logprob_plot branch July 8, 2025 22:11

jialei777 pushed a commit to jialei777/nemo-rl that referenced this pull request Jul 23, 2025

fix: Fix crash for logprob error plot (NVIDIA-NeMo#623)

bb614d4

Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com> Signed-off-by: Jialei Chen <jialeic@google.com>

KiddoZhu pushed a commit that referenced this pull request Jul 28, 2025

fix: Fix crash for logprob error plot (#623)

a6907cd

Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>

FannYYW pushed a commit to xxman-google/NeMo-RL that referenced this pull request Aug 5, 2025

fix: Fix crash for logprob error plot (NVIDIA-NeMo#623)

0787eec

Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: Fix crash for logprob error plot #623

fix: Fix crash for logprob error plot #623

Uh oh!

yfw commented Jul 7, 2025 •

edited

Loading

Uh oh!

Uh oh!

ZhiyuLi-Nvidia commented Jul 8, 2025 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

fix: Fix crash for logprob error plot #623

fix: Fix crash for logprob error plot #623

Uh oh!

Conversation

yfw commented Jul 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do ?

Issues

Usage

Before your PR is "Ready for review"

Additional Information

Uh oh!

Uh oh!

ZhiyuLi-Nvidia commented Jul 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

yfw commented Jul 7, 2025 •

edited

Loading

ZhiyuLi-Nvidia commented Jul 8, 2025 •

edited

Loading