Skip to content

Conversation

yfw
Copy link
Contributor

@yfw yfw commented Jul 7, 2025

What does this PR do ?

Fixes below crash. I believe this is happening when the prompt length exceeds max sequence length

Traceback (most recent call last):
  File "/lustre/fs1/portfolios/coreai/users/yifuw/code/tmp/real_ep_merge/RL/examples/run_grpo_math.py", line 335, in <module>
    main()
  File "/lustre/fs1/portfolios/coreai/users/yifuw/code/tmp/real_ep_merge/RL/examples/run_grpo_math.py", line 318, in main
    grpo_train(
  File "/lustre/fs1/portfolios/coreai/users/yifuw/code/tmp/real_ep_merge/RL/nemo_rl/algorithms/grpo.py", line 777, in grpo_train
    logger.log_plot_token_mult_prob_error(
  File "/lustre/fs1/portfolios/coreai/users/yifuw/code/tmp/real_ep_merge/RL/nemo_rl/utils/logger.py", line 770, in log_plot_token_mult_prob_error
    max_abs_error_idx = torch.argmax(diff_i).item()
                        ^^^^^^^^^^^^^^^^^^^^
IndexError: argmax(): Expected reduction dim to be specified for input.numel() == 0.

Issues

List issues that this PR closes (syntax):

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

Reproduced crash by hardcoding logprob error threshold to 1.0 and setting policy.max_total_sequence_length=64: https://wandb.ai/nvidia/grpo-dev-yifu/runs/tcv42xq1/logs

With fix: https://wandb.ai/nvidia/grpo-dev-yifu/runs/i1krxjes/logs

Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
@yfw yfw requested a review from ZhiyuLi-Nvidia July 7, 2025 23:48
@yfw yfw changed the title Fix crash for logprob error plot fix: Fix crash for logprob error plot Jul 7, 2025
@yfw yfw requested a review from ashors1 July 7, 2025 23:53
@ZhiyuLi-Nvidia
Copy link
Contributor

ZhiyuLi-Nvidia commented Jul 8, 2025

LGTM now to unblock your experiments. Plan to look back to why generation_start_idx >= generation_end_idx later.

@terrykong terrykong enabled auto-merge July 8, 2025 18:01
@terrykong terrykong added this pull request to the merge queue Jul 8, 2025
Merged via the queue into main with commit 2fdb9ea Jul 8, 2025
15 of 18 checks passed
@terrykong terrykong deleted the yifu/fix_logprob_plot branch July 8, 2025 22:11
jialei777 pushed a commit to jialei777/nemo-rl that referenced this pull request Jul 23, 2025
Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
Signed-off-by: Jialei Chen <jialeic@google.com>
KiddoZhu pushed a commit that referenced this pull request Jul 28, 2025
Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
FannYYW pushed a commit to xxman-google/NeMo-RL that referenced this pull request Aug 5, 2025
Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants