Conversation

parthchadha
Contributor

… dtensor+tp>1

@parthchadha parthchadha added the CI:L0 Run doctests and unit tests label Apr 18, 2025
@parthchadha parthchadha force-pushed the pchadha/error-tied-weights branch from 6899a41 to 5dce388 Compare April 18, 2025 20:58
@parthchadha parthchadha added CI:L0 Run doctests and unit tests and removed CI:L0 Run doctests and unit tests labels Apr 18, 2025
@parthchadha parthchadha force-pushed the pchadha/error-tied-weights branch from 5dce388 to 8c95bab Compare April 18, 2025 22:04
@parthchadha parthchadha added CI:L0 Run doctests and unit tests and removed CI:L0 Run doctests and unit tests labels Apr 18, 2025
Contributor

@terrykong terrykong left a comment

great improvement to UX :)

do you think we can just remove this now since this basically covers all tied cases?

@parthchadha parthchadha force-pushed the pchadha/error-tied-weights branch from 9300373 to 063492c Compare April 22, 2025 02:59
parthchadha and others added 9 commits April 21, 2025 19:59
… dtensor+tp>1

Signed-off-by: Parth Chadha <pchadha@nvidia.com>
Signed-off-by: Sahil Jain <sahilj@nvidia.com>
Signed-off-by: Parth Chadha <pchadha@nvidia.com>
…ster (#234)

Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Parth Chadha <pchadha@nvidia.com>
…iable (#217)

Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Parth Chadha <pchadha@nvidia.com>
Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
Signed-off-by: Parth Chadha <pchadha@nvidia.com>
Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
Signed-off-by: Parth Chadha <pchadha@nvidia.com>
Signed-off-by: Parth Chadha <pchadha@nvidia.com>
Signed-off-by: Parth Chadha <pchadha@nvidia.com>
Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com>
Signed-off-by: Parth Chadha <pchadha@nvidia.com>
@parthchadha parthchadha force-pushed the pchadha/error-tied-weights branch from 063492c to 00a11bc Compare April 22, 2025 02:59
@github-actions github-actions bot added documentation Improvements or additions to documentation and removed documentation Improvements or additions to documentation labels Apr 22, 2025
@parthchadha parthchadha added CI:L0 Run doctests and unit tests and removed CI:L0 Run doctests and unit tests labels Apr 22, 2025
…ights

Signed-off-by: Parth Chadha <pchadha@nvidia.com>
@parthchadha parthchadha force-pushed the pchadha/error-tied-weights branch from f684d01 to 3d6962d Compare April 22, 2025 03:01
@parthchadha parthchadha added the CI:L0 Run doctests and unit tests label Apr 22, 2025
Contributor

@SahilJain314 SahilJain314 left a comment

Could we move the checks so that they only raise a ValueError on a train step (and just warn on init)? I could imagine someone loading a model just for validation/inference/evaluation and running into this. Relatedly, a unit test that does exactly that fails here.
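
A minimal sketch of the warn-on-init, raise-on-train behavior suggested above, in plain Python so it runs without a model. Everything here is hypothetical and for illustration only: `find_tied_names`, `PolicySketch`, and the `unsupported_parallelism` flag are not the names used in this PR, and the flag stands in for whichever parallelism settings the real check rejects.

```python
import warnings


def find_tied_names(named_params):
    """Group parameter names that point at the same underlying object."""
    groups = {}
    for name, param in named_params:
        groups.setdefault(id(param), []).append(name)
    return [names for names in groups.values() if len(names) > 1]


class PolicySketch:
    """Hypothetical stand-in for a policy worker (not the class in this PR)."""

    def __init__(self, named_params, unsupported_parallelism):
        # Assumption: the caller has already decided whether the current
        # parallelism settings are incompatible with tied weights.
        self.tied = find_tied_names(named_params)
        self.blocked = bool(self.tied) and unsupported_parallelism
        if self.blocked:
            # Loading for validation/inference/evaluation stays possible.
            warnings.warn(
                f"Tied weights {self.tied} are not supported for training "
                "with the current parallelism settings."
            )

    def train_step(self):
        # The hard failure is deferred until training is actually attempted.
        if self.blocked:
            raise ValueError(
                f"Cannot train with tied weights {self.tied} under the "
                "current parallelism settings."
            )
        return "ok"
```

With this split, constructing the object for evaluation only emits a warning, while calling `train_step()` on the same object raises.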

@parthchadha parthchadha added CI:L0 Run doctests and unit tests and removed CI:L0 Run doctests and unit tests labels Apr 22, 2025
Signed-off-by: Parth Chadha <pchadha@nvidia.com>
@parthchadha parthchadha force-pushed the pchadha/error-tied-weights branch from 4684beb to 3a4c94f Compare April 22, 2025 23:38
@parthchadha parthchadha added CI:L0 Run doctests and unit tests and removed CI:L0 Run doctests and unit tests labels Apr 22, 2025
terrykong
terrykong previously approved these changes Apr 23, 2025
SahilJain314
SahilJain314 previously approved these changes Apr 23, 2025
Contributor

@SahilJain314 SahilJain314 left a comment

lgtm

@SahilJain314 SahilJain314 enabled auto-merge April 23, 2025 00:08
Signed-off-by: Parth Chadha <pchadha@nvidia.com>
@parthchadha parthchadha dismissed stale reviews from SahilJain314 and terrykong via 6f43e60 April 23, 2025 01:27
@parthchadha parthchadha added CI:L0 Run doctests and unit tests and removed CI:L0 Run doctests and unit tests labels Apr 23, 2025
@parthchadha parthchadha added CI:L0 Run doctests and unit tests and removed CI:L0 Run doctests and unit tests labels Apr 23, 2025
@SahilJain314 SahilJain314 added this pull request to the merge queue Apr 23, 2025
Merged via the queue into main with commit 1788e4c Apr 23, 2025
19 checks passed
@SahilJain314 SahilJain314 deleted the pchadha/error-tied-weights branch April 23, 2025 02:43
ashors1 pushed a commit that referenced this pull request Apr 24, 2025
#229)

Signed-off-by: Parth Chadha <pchadha@nvidia.com>
Signed-off-by: Sahil Jain <sahilj@nvidia.com>
Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com>
Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>
Co-authored-by: Yi-Fu Wu <yifu.wu@gmail.com>
Co-authored-by: mckimn <nmckimpson@nvidia.com>
aschilling-nv pushed a commit that referenced this pull request Apr 25, 2025
#229)

Signed-off-by: Parth Chadha <pchadha@nvidia.com>
Signed-off-by: Sahil Jain <sahilj@nvidia.com>
Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com>
Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>
Co-authored-by: Yi-Fu Wu <yifu.wu@gmail.com>
Co-authored-by: mckimn <nmckimpson@nvidia.com>
Signed-off-by: Andrew Schilling <aschilling@nvidia.com>
terrykong added a commit that referenced this pull request May 1, 2025
commit ebb46c3
Author: Anna Shors <ashors@nvidia.com>
Date:   Wed Apr 30 15:03:46 2025 -0700

    fix: fix dtype of empty `token_ids` for consistency (#290)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit cf8f045
Author: Anna Shors <ashors@nvidia.com>
Date:   Wed Apr 30 15:03:19 2025 -0700

    chore: Remove outdated comment in DPO config (#293)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit 04f30bb
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Wed Apr 30 12:19:47 2025 -0700

    fix: Fixed max seqlen not respected correctly (#299)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

commit daac5d9
Author: Anna Shors <ashors@nvidia.com>
Date:   Tue Apr 29 17:30:05 2025 -0700

    chore: Remove online hf checkpointing (#285)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit 3cd8be8
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Tue Apr 29 15:18:37 2025 -0700

    feat: Remove 'last 100' hack for math verifier (#287)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>
    Signed-off-by: Terry Kong <terryk@nvidia.com>
    Co-authored-by: Terry Kong <terryk@nvidia.com>

commit 506910a
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 29 11:29:22 2025 -0700

    test: add a test that checks if recipes can be merged into the base config (#288)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit af43261
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 29 09:18:14 2025 -0700

    chore: add isort rules and pyflakes in ruff/precommit (#291)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 8b0837c
Author: yuki <48991475+yuki-666@users.noreply.github.com>
Date:   Tue Apr 29 23:57:41 2025 +0800

    ci: add eval functional test (#269)

    Signed-off-by: Yuki Huang <yukih@nvidia.com>

commit 68beb6d
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Mon Apr 28 23:35:01 2025 -0700

    feat: rename ratio_eps_{min/max} to ratio_clip_{min/max} for clarity (#283)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

commit 2f5d22f
Author: Hemil Desai <hemild@nvidia.com>
Date:   Mon Apr 28 16:09:00 2025 -0700

    feat: Add hydra style overrides to SFT (#208)

    Signed-off-by: Hemil Desai <hemild@nvidia.com>
    Signed-off-by: ashors1 <ashors@nvidia.com>
    Co-authored-by: ashors1 <ashors@nvidia.com>

commit 8a22c44
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Mon Apr 28 15:11:03 2025 -0700

    feat: publish convergence/release runs (#214)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit af94d43
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Mon Apr 28 15:02:19 2025 -0700

    fix: fixes #264 where tied weights check didn't work on fsdp1 (#284)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>
    Signed-off-by: Parth Chadha <parth29@gmail.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit 1363dba
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Mon Apr 28 12:44:56 2025 -0700

    fix: improve port selection and exiting early from ray.sub (#272)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 044f385
Author: Andrew Schilling <85314306+aschilling-nv@users.noreply.github.com>
Date:   Mon Apr 28 14:22:55 2025 -0500

    docs: Correcting build issues and CI (#270)

    Signed-off-by: Andrew Schilling <aschilling@nvidia.com>

commit 0fae6bc
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Mon Apr 28 11:08:51 2025 -0700

    feat: Updated Name to NeMo RL (#265)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

commit 34cae3a
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Mon Apr 28 08:16:51 2025 -0700

    fix: add bibtex entry (#273)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit ee0d2c8
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Sat Apr 26 20:15:38 2025 -0700

    docs: instruct users to git clone before beginning (#257)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 09f5416
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Fri Apr 25 13:46:41 2025 -0700

    feat: E2E multi-turn RL example with a sliding puzzle game (#242)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>
    Signed-off-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit 47e51d3
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Fri Apr 25 10:13:59 2025 -0700

    chore: better logging when insufficient resources (#271)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 98473c6
Author: Anna Shors <ashors@nvidia.com>
Date:   Thu Apr 24 22:28:05 2025 -0700

    fix: Update DPO and SFT configs to use dtensor (#256)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit 2558444
Author: Anna Shors <ashors@nvidia.com>
Date:   Thu Apr 24 11:02:26 2025 -0700

    fix: Fix fsdp1 grad clipping and log grad norm (#251)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit c8f0a01
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Wed Apr 23 17:58:43 2025 -0700

    docs: add qwen 32b instruction and add 0.3 planned features (#255)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 0a5f31d
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Wed Apr 23 17:49:06 2025 -0400

    fix: fix broken eval script (#253)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit 2f8a140
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Wed Apr 23 12:47:18 2025 -0700

    ci: L1 default and increase test time (#252)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 1c7cbd9
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Wed Apr 23 12:52:13 2025 -0400

    fix: use find_tied_parameters api from HF for tied weight keys (#250)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit 1788e4c
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Tue Apr 22 22:05:53 2025 -0400

    fix: raise error if tied weights model is being trained with fsdp1 or… (#229)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>
    Signed-off-by: Sahil Jain <sahilj@nvidia.com>
    Signed-off-by: Terry Kong <terryk@nvidia.com>
    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
    Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
    Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>
    Co-authored-by: Yi-Fu Wu <yifu.wu@gmail.com>
    Co-authored-by: mckimn <nmckimpson@nvidia.com>

commit 1fa4c7a
Author: Anna Shors <ashors@nvidia.com>
Date:   Tue Apr 22 16:38:50 2025 -0700

    fix: Fix indent in dtensor policy (#248)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit ed546ae
Author: yuki <48991475+yuki-666@users.noreply.github.com>
Date:   Wed Apr 23 07:29:47 2025 +0800

    feat: streaming each dtensor in refit (#176)

    Signed-off-by: Yuki Huang <yukih@nvidia.com>
    Signed-off-by: Alex Qiu <alexq@nvidia.com>
    Signed-off-by: Parth Chadha <pchadha@nvidia.com>
    Co-authored-by: Alex Qiu <alexq@nvidia.com>
    Co-authored-by: Parth Chadha <pchadha@nvidia.com>

commit 5c62657
Author: Yi-Fu Wu <yifu.wu@gmail.com>
Date:   Tue Apr 22 14:14:40 2025 -0700

    feat: Importance sampling trick (#174)

    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit deaece6
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Tue Apr 22 12:39:35 2025 -0700

    feat: Add support for multi-turn generations and RL (tools, games, etc) (#218)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

commit 1245c50
Author: Anna Shors <ashors@nvidia.com>
Date:   Tue Apr 22 12:19:42 2025 -0700

    fix: Speed up DPO functional test (#241)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit af369a3
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 22 12:17:03 2025 -0700

    fix: Move ray worker port range start from 20001 to 53001 (#235)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 756152c
Author: Anna Shors <ashors@nvidia.com>
Date:   Tue Apr 22 10:02:34 2025 -0700

    feat: Support multi-epoch training in SFT (#177)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit bbdd671
Author: Anna Shors <ashors@nvidia.com>
Date:   Mon Apr 21 22:16:15 2025 -0700

    feat: DPO (#180)

    Signed-off-by: ashors1 <ashors@nvidia.com>
    Signed-off-by: Anna Shors <ashors@nvidia.com>
    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
    Co-authored-by: Yi-Fu Wu <yifu.wu@gmail.com>

commit 88bc0fd
Author: mckimn <nmckimpson@nvidia.com>
Date:   Mon Apr 21 17:31:23 2025 -0700

    ci: Remove external config from project (#200)

    Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com>

commit 4a2e126
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Mon Apr 21 17:34:59 2025 -0400

    fix: skip vllm p2p check since its flaky (#238)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit 22af21c
Author: Yi-Fu Wu <yifu.wu@gmail.com>
Date:   Mon Apr 21 12:41:29 2025 -0700

    feat: FSDP2 SFT (#206)

    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>

commit e36f488
Author: Yi-Fu Wu <yifu.wu@gmail.com>
Date:   Mon Apr 21 12:41:24 2025 -0700

    fix: Fix missing import (#222)

    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>

commit 98b7a90
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Sun Apr 20 10:06:09 2025 -0700

    docs: update docs everywhere to remove uv pip install which isn't reliable (#217)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit da191b4
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Sun Apr 20 07:56:55 2025 -0700

    feat: introduce a debug API for backoff and retries for RayVirtualCluster (#234)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 8780093
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Fri Apr 18 17:06:54 2025 -0700

    feat: Add total logging of generations in training (#172)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

commit ce2d121
Author: yuki <48991475+yuki-666@users.noreply.github.com>
Date:   Sat Apr 19 00:22:11 2025 +0800

    fix: fix chat_template in eval (#210)

    Signed-off-by: Yuki Huang <yukih@nvidia.com>

commit f8b6ba9
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Thu Apr 17 12:52:19 2025 -0700

    fix: grpo func test 10 step -> 3 step to speed up CI (#209)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 4a6f62b
Author: Gerald Shen <119401249+gshennvm@users.noreply.github.com>
Date:   Thu Apr 17 11:06:05 2025 -0700

    feat: Add FSDP2, DTensor SP/TP, activation checkpointing support (#131)

    Signed-off-by: Gerald Shen <geshen@nvidia.com>

commit 78a9834
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Thu Apr 17 10:03:34 2025 -0700

    fix: ci uses umask (#211)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 5ff10f6
Author: alexchiu <qiuzhaopeng@foxmail.com>
Date:   Thu Apr 17 08:38:45 2025 +0800

    fix: prevent division by zero in ClippedPGLossFn calculation (#166)

    Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
    Signed-off-by: Alex Qiu <alexq@nvidia.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit 6db2f7a
Author: Yi-Fu Wu <yifu.wu@gmail.com>
Date:   Wed Apr 16 15:53:12 2025 -0700

    feat: Fix CPU offloading + add options for FSDP offload and activation ckpting (#123)

    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
    Co-authored-by: Parth Chadha <pchadha@nvidia.com>
    Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>

commit 62ac8d2
Author: Charlie Truong <chtruong@nvidia.com>
Date:   Wed Apr 16 15:38:53 2025 -0500

    ci: Only include dependencies in test container (#203)

    Signed-off-by: Charlie Truong <chtruong@nvidia.com>
    Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>

commit b00fcc8
Author: Anna Shors <ashors@nvidia.com>
Date:   Wed Apr 16 13:23:40 2025 -0700

    fix: chat template improvements (#148)

    Signed-off-by: ashors1 <ashors@nvidia.com>
    Signed-off-by: Parth Chadha <pchadha@nvidia.com>
    Signed-off-by: Terry Kong <terryk@nvidia.com>
    Signed-off-by: Sahil Jain <sahilj@nvidia.com>
    Co-authored-by: Parth Chadha <pchadha@nvidia.com>
    Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit df31f50
Author: Charlie Truong <chtruong@nvidia.com>
Date:   Wed Apr 16 13:13:58 2025 -0500

    ci: Run tests only in merge queue or when labeled (#159)

    Signed-off-by: Charlie Truong <chtruong@nvidia.com>

commit e3af337
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Wed Apr 16 09:23:30 2025 -0700

    feat: Upgrade to vllm v1 runtime (#170)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>
    Signed-off-by: Charlie Truong <chtruong@nvidia.com>
    Signed-off-by: Yuki Huang <yukih@nvidia.com>
    Signed-off-by: ashors1 <ashors@nvidia.com>
    Signed-off-by: Anna Shors <ashors@nvidia.com>
    Signed-off-by: Terry Kong <terryk@nvidia.com>
    Signed-off-by: Sahil Jain <sahilj@nvidia.com>
    Co-authored-by: Charlie Truong <chtruong@nvidia.com>
    Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>
    Co-authored-by: yuki <48991475+yuki-666@users.noreply.github.com>
    Co-authored-by: Anna Shors <ashors@nvidia.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit dd7c2d7
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 15 16:00:04 2025 -0700

    fix: unit test script halts on first failure (#189)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 92c3f1d
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Tue Apr 15 15:42:01 2025 -0700

    feat: add a unique seed for each vllm llm engine (#171)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit 2ae8935
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 15 14:41:21 2025 -0700

    docs: remove backticks from uv.md title (#179)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 9ac4e62
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 15 12:37:35 2025 -0700

    fix: convert DCP to HF script works without ray cluster (#185)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 8213014
Author: Andrew Schilling <85314306+aschilling-nv@users.noreply.github.com>
Date:   Tue Apr 15 13:55:54 2025 -0500

    docs: Correcting file names (#161)

    Signed-off-by: Andrew Schilling <aschilling@nvidia.com>

commit 4db3167
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 15 11:07:51 2025 -0700

    fix: default to less verbose logging + uv-venv log once per worker  (#141)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit bda6522
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Mon Apr 14 22:31:56 2025 -0700

    docs: run tests with --group test to avoid missing test deps (#188)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit c1fc972
Author: mckimn <nmckimpson@nvidia.com>
Date:   Mon Apr 14 20:43:51 2025 -0700

    ci: Update to include public/ folder for pages deployment (#182)

    Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com>

commit e9812f1
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Mon Apr 14 20:05:46 2025 -0700

    fix: don't use cuda-graphs for vllm generation (#187)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit d7d4cd6
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Mon Apr 14 15:46:13 2025 -0700

    ci: labels for docs/L0/L1/L2 and run even if only doc test (#181)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 0637511
Author: yuki <48991475+yuki-666@users.noreply.github.com>
Date:   Tue Apr 15 05:24:07 2025 +0800

    feat: support arbitrary end_strings (#96)

    Signed-off-by: Yuki Huang <yukih@nvidia.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit c99585c
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Mon Apr 14 14:44:43 2025 -0700

    fix: allow configuring ray ports in ray.sub in case conflict on cluster (#173)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit a5547f2
Author: mckimn <nmckimpson@nvidia.com>
Date:   Mon Apr 14 09:18:02 2025 -0700

    docs: Fix doc build warnings and add external CI config (#157)

    Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com>

commit 32953be
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Fri Apr 11 10:18:03 2025 -0700

    fix: always test vllm (#167)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit c00b8bc
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Thu Apr 10 22:38:40 2025 -0700

    test: Add grpo/reinforce/ppo loss tests (prep for incoming vocab parallel changes) (#162)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

Signed-off-by: Terry Kong <terryk@nvidia.com>
terrykong pushed a commit that referenced this pull request May 1, 2025
Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

test: Add grpo/reinforce/ppo loss tests (prep for incoming vocab parallel changes) (#162)

Signed-off-by: Sahil Jain <sahilj@nvidia.com>

Tech pubs updates to file

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

fix typo

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Incorporated Reviewer Comments in ReadMe

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Tech Pubs updates to files

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Tech Pubs updates to files

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Tech Pubs updates to files

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Tech Pubs updates to files

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Tech Pubs updates to files

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Tech Pubs updates to file

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Tech pups updates to resolve some threads

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Tech pubs updates to resolve some threads

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Tech Pubs minor edits to files

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Squashed commit of the following:

commit ebb46c3
Author: Anna Shors <ashors@nvidia.com>
Date:   Wed Apr 30 15:03:46 2025 -0700

    fix: fix dtype of empty `token_ids` for consistency (#290)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit cf8f045
Author: Anna Shors <ashors@nvidia.com>
Date:   Wed Apr 30 15:03:19 2025 -0700

    chore: Remove outdated comment in DPO config (#293)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit 04f30bb
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Wed Apr 30 12:19:47 2025 -0700

    fix: Fixed max seqlen not respected correctly (#299)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

commit daac5d9
Author: Anna Shors <ashors@nvidia.com>
Date:   Tue Apr 29 17:30:05 2025 -0700

    chore: Remove online hf checkpointing (#285)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit 3cd8be8
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Tue Apr 29 15:18:37 2025 -0700

    feat: Remove 'last 100' hack for math verifier (#287)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>
    Signed-off-by: Terry Kong <terryk@nvidia.com>
    Co-authored-by: Terry Kong <terryk@nvidia.com>

commit 506910a
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 29 11:29:22 2025 -0700

    test: add a test that checks if recipes can be merged into the base config (#288)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit af43261
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 29 09:18:14 2025 -0700

    chore: add isort rules and pyflakes in ruff/precommit (#291)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 8b0837c
Author: yuki <48991475+yuki-666@users.noreply.github.com>
Date:   Tue Apr 29 23:57:41 2025 +0800

    ci: add eval functional test (#269)

    Signed-off-by: Yuki Huang <yukih@nvidia.com>

commit 68beb6d
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Mon Apr 28 23:35:01 2025 -0700

    feat: rename ratio_eps_{min/max} to ratio_clip_{min/max} for clarity (#283)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

commit 2f5d22f
Author: Hemil Desai <hemild@nvidia.com>
Date:   Mon Apr 28 16:09:00 2025 -0700

    feat: Add hydra style overrides to SFT (#208)

    Signed-off-by: Hemil Desai <hemild@nvidia.com>
    Signed-off-by: ashors1 <ashors@nvidia.com>
    Co-authored-by: ashors1 <ashors@nvidia.com>

commit 8a22c44
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Mon Apr 28 15:11:03 2025 -0700

    feat: publish convergence/release runs (#214)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit af94d43
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Mon Apr 28 15:02:19 2025 -0700

    fix: fixes #264 where tied weights check didn't work on fsdp1 (#284)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>
    Signed-off-by: Parth Chadha <parth29@gmail.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit 1363dba
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Mon Apr 28 12:44:56 2025 -0700

    fix: improve port selection and exiting early from ray.sub (#272)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 044f385
Author: Andrew Schilling <85314306+aschilling-nv@users.noreply.github.com>
Date:   Mon Apr 28 14:22:55 2025 -0500

    docs: Correcting build issues and CI (#270)

    Signed-off-by: Andrew Schilling <aschilling@nvidia.com>

commit 0fae6bc
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Mon Apr 28 11:08:51 2025 -0700

    feat: Updated Name to NeMo RL (#265)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

commit 34cae3a
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Mon Apr 28 08:16:51 2025 -0700

    fix: add bibtex entry (#273)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit ee0d2c8
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Sat Apr 26 20:15:38 2025 -0700

    docs: instruct users to git clone before beginning (#257)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 09f5416
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Fri Apr 25 13:46:41 2025 -0700

    feat: E2E multi-turn RL example with a sliding puzzle game (#242)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>
    Signed-off-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit 47e51d3
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Fri Apr 25 10:13:59 2025 -0700

    chore: better logging when insufficient resources (#271)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 98473c6
Author: Anna Shors <ashors@nvidia.com>
Date:   Thu Apr 24 22:28:05 2025 -0700

    fix: Update DPO and SFT configs to use dtensor (#256)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit 2558444
Author: Anna Shors <ashors@nvidia.com>
Date:   Thu Apr 24 11:02:26 2025 -0700

    fix: Fix fsdp1 grad clipping and log grad norm (#251)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit c8f0a01
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Wed Apr 23 17:58:43 2025 -0700

    docs: add qwen 32b instruction and add 0.3 planned features (#255)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 0a5f31d
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Wed Apr 23 17:49:06 2025 -0400

    fix: fix broken eval script (#253)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit 2f8a140
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Wed Apr 23 12:47:18 2025 -0700

    ci: L1 default and increase test time (#252)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 1c7cbd9
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Wed Apr 23 12:52:13 2025 -0400

    fix: use find_tied_parameters api from HF for tied weight keys (#250)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit 1788e4c
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Tue Apr 22 22:05:53 2025 -0400

    fix: raise error if tied weights model is being trained with fsdp1 or… (#229)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>
    Signed-off-by: Sahil Jain <sahilj@nvidia.com>
    Signed-off-by: Terry Kong <terryk@nvidia.com>
    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
    Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
    Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>
    Co-authored-by: Yi-Fu Wu <yifu.wu@gmail.com>
    Co-authored-by: mckimn <nmckimpson@nvidia.com>

commit 1fa4c7a
Author: Anna Shors <ashors@nvidia.com>
Date:   Tue Apr 22 16:38:50 2025 -0700

    fix: Fix indent in dtensor policy (#248)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit ed546ae
Author: yuki <48991475+yuki-666@users.noreply.github.com>
Date:   Wed Apr 23 07:29:47 2025 +0800

    feat: streaming each dtensor in refit (#176)

    Signed-off-by: Yuki Huang <yukih@nvidia.com>
    Signed-off-by: Alex Qiu <alexq@nvidia.com>
    Signed-off-by: Parth Chadha <pchadha@nvidia.com>
    Co-authored-by: Alex Qiu <alexq@nvidia.com>
    Co-authored-by: Parth Chadha <pchadha@nvidia.com>

commit 5c62657
Author: Yi-Fu Wu <yifu.wu@gmail.com>
Date:   Tue Apr 22 14:14:40 2025 -0700

    feat: Importance sampling trick (#174)

    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit deaece6
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Tue Apr 22 12:39:35 2025 -0700

    feat: Add support for multi-turn generations and RL (tools, games, etc) (#218)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

commit 1245c50
Author: Anna Shors <ashors@nvidia.com>
Date:   Tue Apr 22 12:19:42 2025 -0700

    fix: Speed up DPO functional test (#241)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit af369a3
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 22 12:17:03 2025 -0700

    fix: Move ray worker port range start from 20001 to 53001 (#235)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 756152c
Author: Anna Shors <ashors@nvidia.com>
Date:   Tue Apr 22 10:02:34 2025 -0700

    feat: Support multi-epoch training in SFT (#177)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit bbdd671
Author: Anna Shors <ashors@nvidia.com>
Date:   Mon Apr 21 22:16:15 2025 -0700

    feat: DPO (#180)

    Signed-off-by: ashors1 <ashors@nvidia.com>
    Signed-off-by: Anna Shors <ashors@nvidia.com>
    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
    Co-authored-by: Yi-Fu Wu <yifu.wu@gmail.com>

commit 88bc0fd
Author: mckimn <nmckimpson@nvidia.com>
Date:   Mon Apr 21 17:31:23 2025 -0700

    ci: Remove external config from project (#200)

    Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com>

commit 4a2e126
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Mon Apr 21 17:34:59 2025 -0400

    fix: skip vllm p2p check since its flaky (#238)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit 22af21c
Author: Yi-Fu Wu <yifu.wu@gmail.com>
Date:   Mon Apr 21 12:41:29 2025 -0700

    feat: FSDP2 SFT (#206)

    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>

commit e36f488
Author: Yi-Fu Wu <yifu.wu@gmail.com>
Date:   Mon Apr 21 12:41:24 2025 -0700

    fix: Fix missing import (#222)

    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>

commit 98b7a90
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Sun Apr 20 10:06:09 2025 -0700

    docs: update docs everywhere to remove uv pip install which isn't reliable (#217)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit da191b4
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Sun Apr 20 07:56:55 2025 -0700

    feat: introduce a debug API for backoff and retries for RayVirtualCluster (#234)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 8780093
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Fri Apr 18 17:06:54 2025 -0700

    feat: Add total logging of generations in training (#172)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

commit ce2d121
Author: yuki <48991475+yuki-666@users.noreply.github.com>
Date:   Sat Apr 19 00:22:11 2025 +0800

    fix: fix chat_template in eval (#210)

    Signed-off-by: Yuki Huang <yukih@nvidia.com>

commit f8b6ba9
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Thu Apr 17 12:52:19 2025 -0700

    fix: grpo func test 10 step -> 3 step to speed up CI (#209)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 4a6f62b
Author: Gerald Shen <119401249+gshennvm@users.noreply.github.com>
Date:   Thu Apr 17 11:06:05 2025 -0700

    feat: Add FSDP2, DTensor SP/TP, activation checkpointing support (#131)

    Signed-off-by: Gerald Shen <geshen@nvidia.com>

commit 78a9834
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Thu Apr 17 10:03:34 2025 -0700

    fix: ci uses umask (#211)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 5ff10f6
Author: alexchiu <qiuzhaopeng@foxmail.com>
Date:   Thu Apr 17 08:38:45 2025 +0800

    fix: prevent division by zero in ClippedPGLossFn calculation (#166)

    Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
    Signed-off-by: Alex Qiu <alexq@nvidia.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit 6db2f7a
Author: Yi-Fu Wu <yifu.wu@gmail.com>
Date:   Wed Apr 16 15:53:12 2025 -0700

    feat: Fix CPU offloading + add options for FSDP offload and activation ckpting (#123)

    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
    Co-authored-by: Parth Chadha <pchadha@nvidia.com>
    Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>

commit 62ac8d2
Author: Charlie Truong <chtruong@nvidia.com>
Date:   Wed Apr 16 15:38:53 2025 -0500

    ci: Only include dependencies in test container (#203)

    Signed-off-by: Charlie Truong <chtruong@nvidia.com>
    Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>

commit b00fcc8
Author: Anna Shors <ashors@nvidia.com>
Date:   Wed Apr 16 13:23:40 2025 -0700

    fix: chat template improvements (#148)

    Signed-off-by: ashors1 <ashors@nvidia.com>
    Signed-off-by: Parth Chadha <pchadha@nvidia.com>
    Signed-off-by: Terry Kong <terryk@nvidia.com>
    Signed-off-by: Sahil Jain <sahilj@nvidia.com>
    Co-authored-by: Parth Chadha <pchadha@nvidia.com>
    Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit df31f50
Author: Charlie Truong <chtruong@nvidia.com>
Date:   Wed Apr 16 13:13:58 2025 -0500

    ci: Run tests only in merge queue or when labeled (#159)

    Signed-off-by: Charlie Truong <chtruong@nvidia.com>

commit e3af337
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Wed Apr 16 09:23:30 2025 -0700

    feat: Upgrade to vllm v1 runtime (#170)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>
    Signed-off-by: Charlie Truong <chtruong@nvidia.com>
    Signed-off-by: Yuki Huang <yukih@nvidia.com>
    Signed-off-by: ashors1 <ashors@nvidia.com>
    Signed-off-by: Anna Shors <ashors@nvidia.com>
    Signed-off-by: Terry Kong <terryk@nvidia.com>
    Signed-off-by: Sahil Jain <sahilj@nvidia.com>
    Co-authored-by: Charlie Truong <chtruong@nvidia.com>
    Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>
    Co-authored-by: yuki <48991475+yuki-666@users.noreply.github.com>
    Co-authored-by: Anna Shors <ashors@nvidia.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit dd7c2d7
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 15 16:00:04 2025 -0700

    fix: unit test script halts on first failure (#189)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 92c3f1d
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Tue Apr 15 15:42:01 2025 -0700

    feat: add a unique seed for each vllm llm engine (#171)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit 2ae8935
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 15 14:41:21 2025 -0700

    docs: remove backticks from uv.md title (#179)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 9ac4e62
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 15 12:37:35 2025 -0700

    fix: convert DCP to HF script works without ray cluster (#185)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 8213014
Author: Andrew Schilling <85314306+aschilling-nv@users.noreply.github.com>
Date:   Tue Apr 15 13:55:54 2025 -0500

    docs: Correcting file names (#161)

    Signed-off-by: Andrew Schilling <aschilling@nvidia.com>

commit 4db3167
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 15 11:07:51 2025 -0700

    fix: default to less verbose logging + uv-venv log once per worker (#141)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit bda6522
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Mon Apr 14 22:31:56 2025 -0700

    docs: run tests with --group test to avoid missing test deps (#188)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit c1fc972
Author: mckimn <nmckimpson@nvidia.com>
Date:   Mon Apr 14 20:43:51 2025 -0700

    ci: Update to include public/ folder for pages deployment (#182)

    Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com>

commit e9812f1
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Mon Apr 14 20:05:46 2025 -0700

    fix: don't use cuda-graphs for vllm generation (#187)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit d7d4cd6
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Mon Apr 14 15:46:13 2025 -0700

    ci: labels for docs/L0/L1/L2 and run even if only doc test (#181)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 0637511
Author: yuki <48991475+yuki-666@users.noreply.github.com>
Date:   Tue Apr 15 05:24:07 2025 +0800

    feat: support arbitrary end_strings (#96)

    Signed-off-by: Yuki Huang <yukih@nvidia.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit c99585c
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Mon Apr 14 14:44:43 2025 -0700

    fix: allow configuring ray ports in ray.sub in case conflict on cluster (#173)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit a5547f2
Author: mckimn <nmckimpson@nvidia.com>
Date:   Mon Apr 14 09:18:02 2025 -0700

    docs: Fix doc build warnings and add external CI config (#157)

    Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com>

commit 32953be
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Fri Apr 11 10:18:03 2025 -0700

    fix: always test vllm (#167)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit c00b8bc
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Thu Apr 10 22:38:40 2025 -0700

    test: Add grpo/reinforce/ppo loss tests (prep for incoming vocab parallel changes) (#162)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

Signed-off-by: Terry Kong <terryk@nvidia.com>
KiddoZhu pushed a commit that referenced this pull request May 6, 2025
#229)

Signed-off-by: Parth Chadha <pchadha@nvidia.com>
Signed-off-by: Sahil Jain <sahilj@nvidia.com>
Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com>
Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>
Co-authored-by: Yi-Fu Wu <yifu.wu@gmail.com>
Co-authored-by: mckimn <nmckimpson@nvidia.com>
Labels
CI:L0 Run doctests and unit tests

5 participants