@terrykong terrykong commented Jun 21, 2025

Partially addresses #532

With this change, loops like SFT/DPO that only need torch (and not vllm) should work.

GRPO still needs more work: vllm has to be compiled manually for aarch64 because they do not publish aarch64 wheels.
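
As a rough illustration of the split described above, here is a minimal sketch that reports which loops should be runnable in the current environment. The helper names `is_installed` and `runnable_loops` are hypothetical and not part of this repo, and the mapping assumes torch and vllm are the only deciding packages, which may not hold in practice.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if the package can be found on the import path (without importing it)."""
    return importlib.util.find_spec(package) is not None


def runnable_loops(has_torch: bool, has_vllm: bool) -> list[str]:
    """Hypothetical helper: loops expected to work given which packages are present.

    Assumption: SFT/DPO need only torch, while GRPO additionally needs vllm.
    """
    loops = []
    if has_torch:
        loops += ["SFT", "DPO"]
        if has_vllm:
            loops.append("GRPO")
    return loops


if __name__ == "__main__":
    print(runnable_loops(is_installed("torch"), is_installed("vllm")))
```

On an aarch64 machine without a manually built vllm, this would be expected to list only SFT and DPO.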

Perf and convergence do not appear to be affected by manually specifying this index, which pulls the NVIDIA library wheels for CUDA 12.8 instead of 12.6 (which appeared to be the default before the index was specified).
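
One way to confirm which CUDA version an environment actually resolved to is sketched below. It reads `torch.version.cuda` and lists installed distributions whose names start with `nvidia-`. The function names are hypothetical, and the `nvidia-` prefix filter is an assumption about how the NVIDIA library wheels are named.

```python
from importlib import metadata


def torch_cuda_build():
    """Return the CUDA version torch was built against, or None if torch is absent or CPU-only."""
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda


def nvidia_wheels() -> dict:
    """Map installed distributions named 'nvidia-*' to their versions (assumed naming prefix)."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name and name.lower().startswith("nvidia-"):
            found[name] = dist.version
    return dict(sorted(found.items()))


if __name__ == "__main__":
    print("torch CUDA build:", torch_cuda_build())
    for name, version in nvidia_wheels().items():
        print(f"{name}=={version}")
```

Running this before and after specifying the index should show whether the resolved wheels moved from 12.6 to 12.8.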

Signed-off-by: Terry Kong <terryk@nvidia.com>
@terrykong terrykong marked this pull request as ready for review June 24, 2025 16:32
@terrykong terrykong added this pull request to the merge queue Jun 24, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jun 24, 2025
@terrykong terrykong added this pull request to the merge queue Jun 24, 2025
Merged via the queue into main with commit 3c75fd0 Jun 25, 2025
19 of 20 checks passed
@terrykong terrykong deleted the tk/explicit-torch-index branch June 25, 2025 05:11
xxman-google pushed a commit to xxman-google/NeMo-RL that referenced this pull request Jun 25, 2025
…NVIDIA-NeMo#533)

Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Xuehan Xiong <xxman@google.com>

pramodk commented Jul 4, 2025

> GRPO still needs more work: vllm has to be compiled manually for aarch64 because they do not publish aarch64 wheels.

Hello @terrykong! To support GRPO on GH, is vllm's aarch64 wheel the only blocker here?

I just want to understand whether it's only wheel availability or whether there is something more. We want to test this internally (at NVIDIA), hence the question.

therealnaveenkamal pushed a commit to therealnaveenkamal/RL that referenced this pull request Jul 7, 2025
YzjiaoNvd pushed a commit to YzjiaoNvd/NeMo-RL that referenced this pull request Jul 14, 2025
KiddoZhu pushed a commit that referenced this pull request Jul 28, 2025