Conversation

ashors1
Contributor

@ashors1 ashors1 commented Jul 17, 2025

What does this PR do?

  • Decreases gpu_memory_utilization for the Llama 8B and 70B configs to avoid out-of-memory (OOM) errors
  • Updates the Qwen3 30B-A3B parallelism settings for a 30% speedup in step time
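
As an illustration of the first change, here is a minimal Python sketch of lowering a memory-utilization setting in a nested config. The dotted key path, the helper name with_override, and the values 0.6 and 0.5 are assumptions made for this example; the PR description does not state the actual keys or numbers.

```python
from copy import deepcopy


def with_override(config: dict, dotted_key: str, value) -> dict:
    """Return a copy of `config` with the entry at `dotted_key` replaced.

    Hypothetical helper for illustration; it is not part of the PR.
    """
    updated = deepcopy(config)
    node = updated
    *parents, leaf = dotted_key.split(".")
    for key in parents:
        # Create intermediate sections if they are missing.
        node = node.setdefault(key, {})
    node[leaf] = value
    return updated


# Assumed layout and starting value, chosen only for this sketch.
base = {"policy": {"generation": {"vllm_cfg": {"gpu_memory_utilization": 0.6}}}}

# Lower the fraction of GPU memory reserved for generation so that
# more headroom remains for training, reducing the risk of OOM.
lowered = with_override(
    base, "policy.generation.vllm_cfg.gpu_memory_utilization", 0.5
)
```

The helper copies the config instead of mutating it, so the original values stay available for comparison.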

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Signed-off-by: ashors1 <ashors@nvidia.com>
@terrykong terrykong changed the title from "fix: Megatron config updates" to "fix: Megatron config updates to avoid OOM" Jul 18, 2025
@terrykong terrykong added this pull request to the merge queue Jul 18, 2025
Merged via the queue into main with commit 00f930a Jul 18, 2025
15 of 16 checks passed
@terrykong terrykong deleted the ashors/update-mcore-configs branch July 18, 2025 20:29
SahilJain314 pushed a commit that referenced this pull request Jul 21, 2025
Signed-off-by: ashors1 <ashors@nvidia.com>
SahilJain314 pushed a commit that referenced this pull request Jul 21, 2025
Signed-off-by: ashors1 <ashors@nvidia.com>
jialei777 pushed a commit to jialei777/nemo-rl that referenced this pull request Jul 23, 2025
Signed-off-by: ashors1 <ashors@nvidia.com>
Signed-off-by: Jialei Chen <jialeic@google.com>
KiddoZhu pushed a commit that referenced this pull request Jul 28, 2025
Signed-off-by: ashors1 <ashors@nvidia.com>
xxman-google pushed a commit to xxman-google/NeMo-RL that referenced this pull request Jul 30, 2025
Signed-off-by: ashors1 <ashors@nvidia.com>
FannYYW pushed a commit to xxman-google/NeMo-RL that referenced this pull request Aug 5, 2025
Signed-off-by: ashors1 <ashors@nvidia.com>
FannYYW pushed a commit to xxman-google/NeMo-RL that referenced this pull request Aug 5, 2025
Signed-off-by: ashors1 <ashors@nvidia.com>
soodoshll pushed a commit to soodoshll/RL that referenced this pull request Aug 13, 2025
Signed-off-by: ashors1 <ashors@nvidia.com>
Signed-off-by: Qidong Su <qidongs@nvidia.com>

3 participants