yuki-97
Contributor

@yuki-97 yuki-97 commented Jul 18, 2025

Add an assertion that prevents combining CP (context parallel) with SP (sequence parallel) in the DTensor worker.

Related issue: #659.
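
For readers who want to see the shape of such a check, here is a minimal sketch. It is not the code from this PR: the function name `check_cp_sp_compatibility` and the parameters `cp_size` and `sequence_parallel` are hypothetical, chosen only to illustrate an assertion that rejects CP and SP being enabled together.

```python
def check_cp_sp_compatibility(cp_size: int, sequence_parallel: bool) -> None:
    """Hypothetical helper: fail fast if CP and SP are both enabled.

    cp_size > 1 is taken to mean context parallel is active;
    sequence_parallel is a plain on/off flag. Both names are assumptions.
    """
    assert not (cp_size > 1 and sequence_parallel), (
        "Context parallel (CP) cannot be combined with sequence parallel (SP) "
        "in the DTensor worker. Set cp_size to 1 or disable sequence_parallel."
    )


# Illustrative values only; either feature alone passes the check.
check_cp_sp_compatibility(cp_size=1, sequence_parallel=True)
check_cp_sp_compatibility(cp_size=2, sequence_parallel=False)
```

Running the check once at worker setup would surface the unsupported combination before any training step starts, rather than failing later inside the model.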

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Jul 18, 2025
@yuki-97 yuki-97 requested review from terrykong and joyang-nv July 18, 2025 08:31
@yuki-97 yuki-97 force-pushed the yukih/add-assert branch from b207b9a to b6bb3d2 Compare July 18, 2025 11:51
@yuki-97 yuki-97 changed the title chore: add TP+CP+SP (sequence parallel) assertion in DTensor worker chore: add CP+SP (sequence parallel) assertion in DTensor worker Jul 18, 2025
terrykong
terrykong previously approved these changes Jul 18, 2025
@terrykong terrykong added this pull request to the merge queue Jul 18, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jul 18, 2025
@terrykong
Contributor

Test failure, I'll address it.

terrykong
terrykong previously approved these changes Jul 19, 2025
@yuki-97 yuki-97 force-pushed the yukih/add-assert branch 2 times, most recently from 31b545e to 7ff3466 Compare July 19, 2025 01:14
@terrykong terrykong added this pull request to the merge queue Jul 19, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jul 19, 2025
@yuki-97 yuki-97 force-pushed the yukih/add-assert branch 2 times, most recently from c7d3b0a to 8e61bb0 Compare July 19, 2025 03:31
@yuki-97 yuki-97 added the CI:L0 Run doctests and unit tests label Jul 19, 2025
Signed-off-by: Yuki Huang <yukih@nvidia.com>

fix: change sp+tp assert to warning and fix unit test order (#696)

Signed-off-by: Terry Kong <terryk@nvidia.com>

update assert

Co-authored-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Yuki Huang <yukih@nvidia.com>
@yuki-97 yuki-97 force-pushed the yukih/add-assert branch from 8e61bb0 to dd7cf31 Compare July 19, 2025 03:39
@yuki-97 yuki-97 added CI:L0 Run doctests and unit tests and removed CI:L0 Run doctests and unit tests labels Jul 19, 2025
@terrykong terrykong enabled auto-merge July 19, 2025 04:59
@terrykong terrykong added this pull request to the merge queue Jul 19, 2025
Merged via the queue into main with commit 1337a52 Jul 19, 2025
21 of 23 checks passed
@terrykong terrykong deleted the yukih/add-assert branch July 19, 2025 07:34
SahilJain314 pushed a commit that referenced this pull request Jul 21, 2025
Signed-off-by: Yuki Huang <yukih@nvidia.com>
Co-authored-by: Terry Kong <terryk@nvidia.com>
SahilJain314 pushed a commit that referenced this pull request Jul 21, 2025
Signed-off-by: Yuki Huang <yukih@nvidia.com>
Co-authored-by: Terry Kong <terryk@nvidia.com>
jialei777 pushed a commit to jialei777/nemo-rl that referenced this pull request Jul 23, 2025
…DIA-NeMo#689)

Signed-off-by: Yuki Huang <yukih@nvidia.com>
Co-authored-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Jialei Chen <jialeic@google.com>
KiddoZhu pushed a commit that referenced this pull request Jul 28, 2025
Signed-off-by: Yuki Huang <yukih@nvidia.com>
Co-authored-by: Terry Kong <terryk@nvidia.com>
rohitrango pushed a commit that referenced this pull request Jul 29, 2025
Signed-off-by: Yuki Huang <yukih@nvidia.com>
Co-authored-by: Terry Kong <terryk@nvidia.com>
xxman-google pushed a commit to xxman-google/NeMo-RL that referenced this pull request Jul 30, 2025
…DIA-NeMo#689)

Signed-off-by: Yuki Huang <yukih@nvidia.com>
Co-authored-by: Terry Kong <terryk@nvidia.com>
FannYYW pushed a commit to xxman-google/NeMo-RL that referenced this pull request Aug 5, 2025
…DIA-NeMo#689)

Signed-off-by: Yuki Huang <yukih@nvidia.com>
Co-authored-by: Terry Kong <terryk@nvidia.com>
FannYYW pushed a commit to xxman-google/NeMo-RL that referenced this pull request Aug 5, 2025
…DIA-NeMo#689)

Signed-off-by: Yuki Huang <yukih@nvidia.com>
Co-authored-by: Terry Kong <terryk@nvidia.com>
soodoshll pushed a commit to soodoshll/RL that referenced this pull request Aug 13, 2025
…DIA-NeMo#689)

Signed-off-by: Yuki Huang <yukih@nvidia.com>
Co-authored-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Qidong Su <qidongs@nvidia.com>