docs: Post-Merge cleanup of the deploy documentation #1922

atchernych · 2025-07-14T21:22:31Z

Overview:

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

closes GitHub issue: DYN-591

Summary by CodeRabbit

Documentation
- Condensed and restructured deployment guides for improved clarity and brevity.
- Removed detailed step-by-step and example deployment instructions, replacing them with references to installation and example resources.
- Updated quickstart and example guides to clarify installation options and streamline environment setup.
- Deleted the manual Helm deployment guide and consolidated example references.
- Improved formatting and corrected minor typos in example documentation.

coderabbitai · 2025-07-14T21:25:02Z

Walkthrough

The changes remove or condense detailed deployment documentation throughout the project. This includes deleting step-by-step guides, example deployments, and references to specific deployment commands or scripts. The remaining documentation now points users to high-level guides and external resources, streamlining instructions and focusing on directing users to core installation and usage materials.

Changes

| File(s) | Change Summary |
|---|---|
| deploy/README.md | Removed simplified deployment process references, operator guide link, and entire "Example Deployments" section. |
| docs/examples/llm_deployment.md | Clarified headings, fixed typos, removed Kubernetes deployment section, and made minor formatting improvements. |
| docs/guides/dynamo_deploy/README.md | Condensed and restructured deployment guidance; removed detailed procedural content and focused on external docs. |
| docs/guides/dynamo_deploy/dynamo_cloud.md | Simplified and shortened; removed detailed build and deployment instructions, referencing Quickstart Guide. |
| docs/guides/dynamo_deploy/manual_helm_deployment.md | Deleted comprehensive manual Helm deployment guide for Kubernetes. |
| docs/guides/dynamo_deploy/quickstart.md | Clarified install options, restructured example exploration, and improved flow for environment setup. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Documentation

    User->>Documentation: Access deployment instructions
    Documentation-->>User: Directs to high-level guides and external resources
    Note over Documentation: Detailed step-by-step deployment and examples are omitted

Possibly related PRs

feat: remove dynamo deployment from cli #1742: Removes the actual Dynamo deployment CLI code and Kubernetes deployment manager class, directly related to this PR's removal of deployment documentation and references.
feat: add crds for vllm and llm examples #1766: Adds new Kubernetes CRD YAML files for LLM and vLLM deployments, related as this PR removes LLM deployment documentation while the other adds deployment manifests.

Poem

In burrows deep, the docs were trimmed,
No more long guides or scripts to skim.
Now just a hop to guides anew,
With quickstart paths to follow through.
Deployment tales, concise and neat—
The rabbit’s work is now complete!
🐇✨

coderabbitai

Actionable comments posted: 1

🔭 Outside diff range comments (1)

docs/examples/llm_deployment.md (1)

95-118: Remove duplicated explanatory block

The paragraph on lines 97-107 is repeated verbatim on lines 108-118, which makes the Note twice as long and trips up readers.

  Building a vLLM docker image for ARM machines currently involves building vLLM from source, which is known to have performance issues to require extensive system RAM; see [vLLM Issue 8878](https://github.com/vllm-project/vllm/issues/8878).
@@
-When vLLM has pre-built ARM wheels published, this process can be improved.
-
-You can tune the number of parallel build jobs for building VLLM from source
-on ARM based on your available cores and system RAM with `VLLM_MAX_JOBS`.
-
-For example, on an ARM machine with low system resources:
-`./container/build.sh --framework vllm --platform linux/arm64 --build-arg VLLM_MAX_JOBS=2`
-
-For example, on a GB200 which has very high CPU cores and memory resource:
-`./container/build.sh --framework vllm --platform linux/arm64 --build-arg VLLM_MAX_JOBS=64`
-
-When vLLM has pre-built ARM wheels published, this process can be improved.
+When vLLM ships pre-built ARM wheels, this process will simplify.
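
The `VLLM_MAX_JOBS` guidance in the removed lines (few jobs on a small ARM machine, many on a GB200) can be sketched as a small helper. This is an illustration only: the `pick_vllm_max_jobs` function and the 4 GiB-per-job budget are assumptions made for this sketch, not values from the Dynamo or vLLM documentation. Only `VLLM_MAX_JOBS` and the `./container/build.sh` invocation come from the text above.

```bash
# Hypothetical helper: choose a VLLM_MAX_JOBS value from CPU cores and system RAM.
# Usage: pick_vllm_max_jobs <cores> <ram_gib> [gib_per_job]
# The default budget of 4 GiB per job is an assumption, not a documented figure.
pick_vllm_max_jobs() {
  cores="$1"
  ram_gib="$2"
  gib_per_job="${3:-4}"

  # Start from how many jobs the RAM budget allows.
  jobs=$(( ram_gib / gib_per_job ))

  # Never run more jobs than there are cores.
  if [ "$jobs" -gt "$cores" ]; then
    jobs="$cores"
  fi

  # Always run at least one job.
  if [ "$jobs" -lt 1 ]; then
    jobs=1
  fi

  echo "$jobs"
}

# Print (not run) the build command for a machine with 8 cores and 16 GiB of RAM.
echo "./container/build.sh --framework vllm --platform linux/arm64 --build-arg VLLM_MAX_JOBS=$(pick_vllm_max_jobs 8 16)"
```

The helper only prints the command, so it can be checked before a long build is started. Adjust the per-job budget to whatever the build actually needs on your hardware.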

🧹 Nitpick comments (7)

docs/guides/dynamo_deploy/quickstart.md (3)
4-4: Prefer “can install” over “could install” for clarity

“Could” reads like a hypothetical. “Can install” (or “You can install”) is the common idiom for instructions.
-You could install from [Published Artifacts](#1-installing-from-published-artifacts) or [Source](#2-installing-from-source)
+You can install from [Published Artifacts](#1-installing-from-published-artifacts) or [Source](#2-installing-from-source)
139-151: Streamline the destination-selection snippet

Capitalise “Kubernetes” for consistency.

Consider using a list instead of free-form prose to make the two alternatives visually parallel and easier to scan.

The current layout omits language identifiers on the second code block – stick with ```bash for both.
-Pick your deployment destination.
-
-If local
-
-```bash
-export DYNAMO_CLOUD=http://localhost:8080
-```
-
-If kubernetes
-```bash
-export DYNAMO_CLOUD=https://dynamo-cloud.nvidia.com
-```
+Pick your deployment destination:
+
+- **Local**
+  ```bash
+  export DYNAMO_CLOUD=http://localhost:8080
+  ```
+
+- **Kubernetes**
+  ```bash
+  export DYNAMO_CLOUD=https://dynamo-cloud.nvidia.com
+  ```
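
If the destination needs to be chosen from a script rather than by hand, the two alternatives above could be wrapped as follows. This is a sketch under stated assumptions: the `dynamo_cloud_url` function and the `DEPLOY_TARGET` variable are hypothetical names introduced here, and only the two URLs and the `DYNAMO_CLOUD` variable come from the snippet above.

```bash
# Hypothetical helper: map a deployment target name to the DYNAMO_CLOUD endpoint.
dynamo_cloud_url() {
  case "$1" in
    local)      echo "http://localhost:8080" ;;
    kubernetes) echo "https://dynamo-cloud.nvidia.com" ;;
    *)          echo "unknown target: $1" >&2; return 1 ;;
  esac
}

# DEPLOY_TARGET is an assumed variable name; it defaults to a local deployment.
DYNAMO_CLOUD="$(dynamo_cloud_url "${DEPLOY_TARGET:-local}")" && export DYNAMO_CLOUD
echo "DYNAMO_CLOUD=$DYNAMO_CLOUD"
```

Rejecting unknown targets with a non-zero exit status keeps a typo from silently exporting an empty endpoint.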
161-168: Fix grammar & tighten cross-references

• “Example are stored” → “Examples are stored”.
• Links already imply folder and guide; no need for extra words.
-Example are stored in the [Examples Folder](../../examples/README.md)
-
-Follow the most basic ones in the dedicated guides below:
-
-[Hello World](../../examples/hello_world.md)
-
-[LLM](../../examples/llm_deployment.md)
+Examples live in the [examples directory](../../examples/README.md).
+
+Quick-start guides:
+- [Hello World](../../examples/hello_world.md)
+- [LLM](../../examples/llm_deployment.md)
docs/examples/llm_deployment.md (1)

150-152: Minor casing correction

“Planner” sentence is great; just lower-case the flag to match CLI conventions (--planner.no-operation=false) unless the binary really expects uppercase P.
docs/guides/dynamo_deploy/README.md (2)
18-24: Tighten wording and fix “Prior to”

Replace “Prior to” with “Before” (wordiness). The sentence on lines 20-24 is also long; splitting it improves readability.
-We expect users to deploy their inference graphs using CRDs or helm charts.
-
-Prior to deploying an inference graph the user should deploy the Dynamo Cloud Platform.
-Dynamo Cloud acts as an orchestration layer between the end user and Kubernetes, handling the complexity of deploying your graphs for you. This is a one-time action, only necessary the first time you deploy a DynamoGraph.
+We expect users to deploy inference graphs via CRDs or Helm charts.
+
+Before deploying a graph, install the Dynamo Cloud Platform.  
+Dynamo Cloud is an orchestration layer that hides Kubernetes complexity and only needs to be installed once for a given cluster.
26-30: Avoid heading numbers inside Markdown heading text

Markdown headings with “# 1.” & “# 2.” render oddly and hurt anchor stability. Prefer plain headings or an ordered list.
-# 1. Please follow [Installing Dynamo Cloud](./dynamo_cloud.md) for steps to install.
-For details about the Dynamo Cloud Platform, see the [Dynamo Operator Guide](dynamo_operator.md)
-
-# 2. Follow [Examples](../../examples/README.md) to see how you can deploy your Inference Graphs.
+## 1  Install Dynamo Cloud  
+See [Installing Dynamo Cloud](./dynamo_cloud.md). For operator details, consult the [Dynamo Operator Guide](dynamo_operator.md).
+
+## 2  Deploy an inference graph  
+Browse the [examples](../../examples/README.md) to get started.
docs/guides/dynamo_deploy/dynamo_cloud.md (1)

56-66: Unclear one-time image instructions

The block mixes “pre-built image” vs “build & push your own” but uses the same variable (DYNAMO_IMAGE) later. Consider explicitly stating which path readers should choose and when to set DYNAMO_IMAGE.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3e0cb07 and 9c2015d.

📒 Files selected for processing (6)

deploy/README.md (0 hunks)
docs/examples/llm_deployment.md (4 hunks)
docs/guides/dynamo_deploy/README.md (1 hunks)
docs/guides/dynamo_deploy/dynamo_cloud.md (3 hunks)
docs/guides/dynamo_deploy/manual_helm_deployment.md (0 hunks)
docs/guides/dynamo_deploy/quickstart.md (3 hunks)

💤 Files with no reviewable changes (2)

deploy/README.md
docs/guides/dynamo_deploy/manual_helm_deployment.md

🧰 Additional context used

🧠 Learnings (2)

docs/guides/dynamo_deploy/quickstart.md (1)

Learnt from: PeaBrane
PR: ai-dynamo/dynamo#1409
File: examples/router_standalone/worker.py:171-186
Timestamp: 2025-06-08T08:30:45.126Z
Learning: Example code in the `examples/` directory may intentionally use hard-coded values or simplified implementations that wouldn't be appropriate for production code, but are acceptable for demonstration and testing purposes.

docs/guides/dynamo_deploy/dynamo_cloud.md (3)

Learnt from: biswapanda
PR: ai-dynamo/dynamo#1412
File: lib/bindings/python/src/dynamo/runtime/logging.py:100-100
Timestamp: 2025-06-06T21:48:35.214Z
Learning: In the Dynamo codebase, BentoML has been completely removed from all executable code, with only documentation and attribution references remaining. The error_loggers configuration in lib/bindings/python/src/dynamo/runtime/logging.py should not include "bentoml" since those modules no longer exist.

Learnt from: julienmancuso
PR: ai-dynamo/dynamo#1365
File: deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types.go:171-178
Timestamp: 2025-06-04T13:09:53.416Z
Learning: The `DYN_DEPLOYMENT_CONFIG` environment variable (commonconsts.DynamoDeploymentConfigEnvVar) in the Dynamo operator will never be set via ValueFrom (secrets/config maps), only via direct Value assignment. The GetDynamoDeploymentConfig method correctly only checks env.Value for this specific environment variable.

Learnt from: julienmancuso
PR: ai-dynamo/dynamo#1474
File: deploy/cloud/operator/internal/controller/dynamocomponent_controller.go:1302-1306
Timestamp: 2025-06-11T21:18:00.425Z
Learning: In the Dynamo operator, the project’s preferred security posture is to set a Pod-level `PodSecurityContext` with `runAsUser`, `runAsGroup`, and `fsGroup` all set to `1000`, and then selectively override the user at the individual container level (e.g., `RunAsUser: 0` for Kaniko) when root is required.

🪛 LanguageTool

docs/guides/dynamo_deploy/README.md

[style] ~21-~21: ‘Prior to’ might be wordy. Consider a shorter alternative.
Context: ...ence graphs using CRDs or helm charts. Prior to deploying an inference graph the user s...

(EN_WORDINESS_PREMIUM_PRIOR_TO)

🪛 markdownlint-cli2 (0.17.2)

docs/guides/dynamo_deploy/dynamo_cloud.md

69-69: Multiple headings with the same content

(MD024, no-duplicate-heading)

🔇 Additional comments (2)

docs/examples/llm_deployment.md (2)

84-92: Title change looks good

The new heading accurately reflects the step.

119-123: Heading tweak approved

The revised heading clarifies intent without altering meaning.

docs/guides/dynamo_deploy/dynamo_cloud.md

deploy/README.md

docs/examples/llm_deployment.md

Signed-off-by: atchernych <atchernych@nvidia.com>

commit d4b5414 Author: atchernych <atchernych@nvidia.com> Date: Mon Jul 21 13:10:24 2025 -0700 fix: mypy error (#2029)
commit 79337c7 Author: Ryan McCormick <rmccormick@nvidia.com> Date: Mon Jul 21 12:12:16 2025 -0700 build: support custom TRTLLM build for commits not on main branch (#2021)
commit 95dd942 Author: atchernych <atchernych@nvidia.com> Date: Mon Jul 21 12:09:33 2025 -0700 docs: Post-Merge cleanup of the deploy documentation (#1922)
commit cb6de94 Author: ptarasiewiczNV <104908264+ptarasiewiczNV@users.noreply.github.com> Date: Sun Jul 20 22:34:50 2025 +0200 chore: Install vLLM and WideEP kernels in vLLM runtime container (#2010) Signed-off-by: Alec <35311602+alec-flowers@users.noreply.github.com> Co-authored-by: Alec <35311602+alec-flowers@users.noreply.github.com> Co-authored-by: alec-flowers <aflowers@nvidia.com>
commit fe63c17 Author: Alec <35311602+alec-flowers@users.noreply.github.com> Date: Fri Jul 18 17:45:08 2025 -0700 fix: Revert "feat: add vLLM v1 multi-modal example. Add llama4 Maverick ex… (#2017)
commit bf1998f Author: jthomson04 <jwillthomson19@gmail.com> Date: Fri Jul 18 17:23:50 2025 -0700 fix: Don't detokenize twice in TRT-LLM examples (#1955)
commit 343a481 Author: Ryan Olson <ryanolson@users.noreply.github.com> Date: Fri Jul 18 16:22:43 2025 -0600 feat: http disconnects (#2014)
commit e330d96 Author: Yan Ru Pei <yanrpei@gmail.com> Date: Fri Jul 18 13:40:54 2025 -0700 feat: enable / disable chunked prefill for mockers (#2015) Signed-off-by: Yan Ru Pei <yanrpei@gmail.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
commit 353146e Author: GuanLuo <41310872+GuanLuo@users.noreply.github.com> Date: Fri Jul 18 13:33:36 2025 -0700 feat: add vLLM v1 multi-modal example. Add llama4 Maverick example (#1990) Signed-off-by: GuanLuo <41310872+GuanLuo@users.noreply.github.com> Co-authored-by: krishung5 <krish@nvidia.com>
commit 1f07dab Author: Jacky <18255193+kthui@users.noreply.github.com> Date: Fri Jul 18 13:04:20 2025 -0700 feat: Add migration to LLM requests (#1930)
commit 5f17918 Author: Tanmay Verma <tanmayv@nvidia.com> Date: Fri Jul 18 12:59:34 2025 -0700 refactor: Migrate to new UX2 for python launch (#2003)
commit fc12436 Author: Graham King <grahamk@nvidia.com> Date: Fri Jul 18 14:52:57 2025 -0400 feat(frontend): router-mode settings (#2001)
commit dc75cf1 Author: ptarasiewiczNV <104908264+ptarasiewiczNV@users.noreply.github.com> Date: Fri Jul 18 18:47:28 2025 +0200 chore: Move NIXL repo clone to Dockerfiles (#2009)
commit f6f392c Author: Iman Tabrizian <10105175+Tabrizian@users.noreply.github.com> Date: Thu Jul 17 18:44:17 2025 -0700 Remove link to the fix for disagg + eagle3 for TRT-LLM example (#2006) Signed-off-by: Iman Tabrizian <10105175+Tabrizian@users.noreply.github.com>
commit cc90ca6 Author: atchernych <atchernych@nvidia.com> Date: Thu Jul 17 18:34:40 2025 -0700 feat: Create a convenience script to uninstall Dynamo Deploy CRDs (#1933)
commit 267b422 Author: Greg Clark <grclark@nvidia.com> Date: Thu Jul 17 20:44:21 2025 -0400 chore: loosed python requirement versions (#1998) Signed-off-by: Greg Clark <grclark@nvidia.com>
commit b8474e5 Author: ishandhanani <82981111+ishandhanani@users.noreply.github.com> Date: Thu Jul 17 16:35:05 2025 -0700 chore: update cmake and gap installation and sgl in wideep container (#1991)
commit 157a3b0 Author: Biswa Panda <biswa.panda@gmail.com> Date: Thu Jul 17 15:38:12 2025 -0700 fix: incorrect helm upgrade command (#2000)
commit 0dfca2c Author: Ryan McCormick <rmccormick@nvidia.com> Date: Thu Jul 17 15:33:33 2025 -0700 ci: Update trtllm gitlab triggers for new components directory and test script (#1992)
commit f3fb09e Author: Kris Hung <krish@nvidia.com> Date: Thu Jul 17 14:59:59 2025 -0700 fix: Fix syntax for tokio-console (#1997)
commit dacffb8 Author: Biswa Panda <biswa.panda@gmail.com> Date: Thu Jul 17 14:57:10 2025 -0700 fix: use non-dev golang image for operator (#1993)
commit 2b29a0a Author: zaristei <zaristei@berkeley.edu> Date: Thu Jul 17 13:10:42 2025 -0700 fix: Working Arm Build Dockerfile for Vllm_v1 (#1844)
commit 2430d89 Author: Ryan McCormick <rmccormick@nvidia.com> Date: Thu Jul 17 12:57:46 2025 -0700 test: Add trtllm kv router tests (#1988)
commit 1eadc01 Author: Graham King <grahamk@nvidia.com> Date: Thu Jul 17 15:07:41 2025 -0400 feat(runtime): Support tokio-console (#1986)
commit b62e633 Author: GuanLuo <41310872+GuanLuo@users.noreply.github.com> Date: Thu Jul 17 11:16:28 2025 -0700 feat: support separate chat_template.jinja file (#1853)
commit 8ae3719 Author: Hongkuan Zhou <tedzhouhk@gmail.com> Date: Thu Jul 17 11:12:35 2025 -0700 chore: add some details to dynamo deploy quickstart and fix deploy.sh (#1978) Signed-off-by: Hongkuan Zhou <tedzhouhk@gmail.com> Co-authored-by: julienmancuso <161955438+julienmancuso@users.noreply.github.com>
commit 08891ff Author: Ryan McCormick <rmccormick@nvidia.com> Date: Thu Jul 17 10:57:42 2025 -0700 fix: Update trtllm tests to use new scripts instead of dynamo serve (#1979)
commit 49b7a0d Author: Ryan Olson <ryanolson@users.noreply.github.com> Date: Thu Jul 17 08:35:04 2025 -0600 feat: record + analyze logprobs (#1957)
commit 6d2be14 Author: Biswa Panda <biswa.panda@gmail.com> Date: Thu Jul 17 00:17:58 2025 -0700 refactor: replace vllm with vllm_v1 container (#1953) Co-authored-by: alec-flowers <aflowers@nvidia.com>
commit 4d2a31a Author: ishandhanani <82981111+ishandhanani@users.noreply.github.com> Date: Wed Jul 16 18:04:09 2025 -0700 chore: add port reservation to utils (#1980)
commit 1e3e4a0 Author: Alec <35311602+alec-flowers@users.noreply.github.com> Date: Wed Jul 16 15:54:04 2025 -0700 fix: port race condition through deterministic ports (#1937)
commit 4ad281f Author: Tanmay Verma <tanmayv@nvidia.com> Date: Wed Jul 16 14:33:51 2025 -0700 refactor: Move TRTLLM example to the component/backends (#1976)
commit 57d24a1 Author: Misha Chornyi <99709299+mc-nv@users.noreply.github.com> Date: Wed Jul 16 14:10:24 2025 -0700 build: Removing shell configuration violations. It's bad practice to hardcod… (#1973)
commit 182d3b5 Author: Graham King <grahamk@nvidia.com> Date: Wed Jul 16 16:12:40 2025 -0400 chore(bindings): Remove mistralrs / llama.cpp (#1970)
commit def6eaa Author: Harrison Saturley-Hall <454891+saturley-hall@users.noreply.github.com> Date: Wed Jul 16 15:50:23 2025 -0400 feat: attributions for debian deps of sglang, trtllm, vllm runtime containers (#1971)
commit f31732a Author: Yan Ru Pei <yanrpei@gmail.com> Date: Wed Jul 16 11:22:15 2025 -0700 feat: integrate mocker with dynamo-run and python cli (#1927)
commit aba6099 Author: Graham King <grahamk@nvidia.com> Date: Wed Jul 16 12:26:32 2025 -0400 perf(router): Remove lock from router hot path (#1963)
commit b212103 Author: Hongkuan Zhou <tedzhouhk@gmail.com> Date: Wed Jul 16 08:55:33 2025 -0700 docs: add notes in docs to deprecate local connector (#1959)
commit 7b325ee Author: Biswa Panda <biswa.panda@gmail.com> Date: Tue Jul 15 18:52:00 2025 -0700 fix: vllm router examples (#1942)
commit a50be1a Author: hhzhang16 <54051230+hhzhang16@users.noreply.github.com> Date: Tue Jul 15 17:58:01 2025 -0700 feat: update CODEOWNERS (#1926)
commit e260fdf Author: Harrison Saturley-Hall <454891+saturley-hall@users.noreply.github.com> Date: Tue Jul 15 18:49:21 2025 -0400 feat: add bitnami helm chart attribution (#1943) Signed-off-by: Harrison Saturley-Hall <454891+saturley-hall@users.noreply.github.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
commit 1c03404 Author: Biswa Panda <biswa.panda@gmail.com> Date: Tue Jul 15 14:26:24 2025 -0700 fix: update inference gateway deployment instructions (#1940)
commit 5ca570f Author: Graham King <grahamk@nvidia.com> Date: Tue Jul 15 16:54:03 2025 -0400 chore: Rename dynamo.ingress to dynamo.frontend (#1944)
commit 7b9182f Author: Graham King <grahamk@nvidia.com> Date: Tue Jul 15 16:33:07 2025 -0400 chore: Move examples/cli to lib/bindings/examples/cli (#1952)
commit 40d40dd Author: Graham King <grahamk@nvidia.com> Date: Tue Jul 15 16:02:19 2025 -0400 chore(multi-modal): Rename frontend.py to web.py (#1951)
commit a9e0891 Author: Ryan Olson <ryanolson@users.noreply.github.com> Date: Tue Jul 15 12:30:30 2025 -0600 feat: adding http clients and recorded response stream (#1919)
commit 4128d58 Author: Biswa Panda <biswa.panda@gmail.com> Date: Tue Jul 15 10:30:47 2025 -0700 feat: allow helm upgrade using deploy script (#1936)
commit 4da078b Author: Graham King <grahamk@nvidia.com> Date: Tue Jul 15 12:57:38 2025 -0400 fix: Remove OpenSSL dependency, use Rust TLS (#1945)
commit fc004d4 Author: jthomson04 <jwillthomson19@gmail.com> Date: Tue Jul 15 08:45:42 2025 -0700 fix: Fix TRT-LLM container build when using a custom pip wheel (#1825)
commit 3c6fc6f Author: ishandhanani <82981111+ishandhanani@users.noreply.github.com> Date: Mon Jul 14 22:35:20 2025 -0700 chore: fix typo (#1938)
commit de7fe38 Author: Alec <35311602+alec-flowers@users.noreply.github.com> Date: Mon Jul 14 21:47:12 2025 -0700 feat: add vllm e2e integration tests (#1935)
commit 860f3f7 Author: Keiven C <213854356+keivenchang@users.noreply.github.com> Date: Mon Jul 14 21:44:19 2025 -0700 chore: metrics endpoint variables renamed from HTTP_SERVER->SYSTEM (#1934) Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
commit fc402a3 Author: Biswa Panda <biswa.panda@gmail.com> Date: Mon Jul 14 21:21:20 2025 -0700 feat: configurable namespace for vllm v1 example (#1909)
commit df40d2c Author: ZichengMa <zichengma1225@gmail.com> Date: Mon Jul 14 21:11:29 2025 -0700 docs: fix typo and add mount-workspace to vllm doc (#1931) Signed-off-by: ZichengMa <zichengma1225@gmail.com> Co-authored-by: Alec <35311602+alec-flowers@users.noreply.github.com>
commit 901715b Author: Tanmay Verma <tanmayv@nvidia.com> Date: Mon Jul 14 20:14:51 2025 -0700 refactor: Refactor the TRTLLM examples remove dynamo SDK (#1884)
commit 5bf23d5 Author: hhzhang16 <54051230+hhzhang16@users.noreply.github.com> Date: Mon Jul 14 18:29:19 2025 -0700 feat: update DynamoGraphDeployments for vllm_v1 (#1890) Co-authored-by: mohammedabdulwahhab <furkhan324@berkeley.edu>
commit 9e76590 Author: ishandhanani <82981111+ishandhanani@users.noreply.github.com> Date: Mon Jul 14 17:29:56 2025 -0700 docs: organize sglang readme (#1910)
commit ef59ac8 Author: KrishnanPrash <140860868+KrishnanPrash@users.noreply.github.com> Date: Mon Jul 14 16:16:44 2025 -0700 docs: TRTLLM Example of Llama4+Eagle3 (Speculative Decoding) (#1828) Signed-off-by: KrishnanPrash <140860868+KrishnanPrash@users.noreply.github.com> Co-authored-by: Iman Tabrizian <10105175+Tabrizian@users.noreply.github.com>
commit 053041e Author: Jorge António <matroid@outlook.com> Date: Tue Jul 15 00:06:38 2025 +0100 fix: resolve incorrect finish reason propagation (#1857)
commit 3733f58 Author: Graham King <grahamk@nvidia.com> Date: Mon Jul 14 19:04:22 2025 -0400 feat(backends): Python llama.cpp engine (#1925)
commit 6a1350c Author: Tushar Sharma <tusharma@nvidia.com> Date: Mon Jul 14 14:56:36 2025 -0700 build: minor improvements to sglang dockerfile (#1917)
commit e2a619b Author: Neelay Shah <neelays@nvidia.com> Date: Mon Jul 14 14:52:53 2025 -0700 fix: remove environment variable passing (#1911) Signed-off-by: Neelay Shah <neelays@nvidia.com> Co-authored-by: Neelay Shah <neelays@a4u8g-0057.ipp2u2.colossus.nvidia.com>
commit 3d17a49 Author: Schwinn Saereesitthipitak <17022745+galletas1712@users.noreply.github.com> Date: Mon Jul 14 14:41:56 2025 -0700 refactor: remove dynamo build (#1778) Signed-off-by: Schwinn Saereesitthipitak <17022745+galletas1712@users.noreply.github.com>
commit 3e0cb07 Author: Anant Sharma <anants@nvidia.com> Date: Mon Jul 14 15:43:48 2025 -0400 fix: copy attributions and license to trtllm runtime container (#1916)
commit fc36bf5 Author: ishandhanani <82981111+ishandhanani@users.noreply.github.com> Date: Mon Jul 14 12:31:49 2025 -0700 feat: receive kvmetrics from sglang scheduler (#1789) Co-authored-by: zixuanzhang226 <zixuanzhang@bytedance.com>
commit df91fce Author: Yan Ru Pei <yanrpei@gmail.com> Date: Mon Jul 14 12:24:04 2025 -0700 feat: prefill aware routing (#1895)
commit ad8ad66 Author: Graham King <grahamk@nvidia.com> Date: Mon Jul 14 15:20:35 2025 -0400 feat: Shrink the ai-dynamo wheel by 35 MiB (#1918) Remove http and llmctl binaries. They have been unused for a while.
commit 480b41d Author: Graham King <grahamk@nvidia.com> Date: Mon Jul 14 15:06:45 2025 -0400 feat: Python frontend / ingress node (#1912)

Update deploy documentation

9c2015d

atchernych requested review from hutm, biswapanda, ishandhanani, julienmancuso, hhzhang16, nnshah1 and mohammedabdulwahhab as code owners July 14, 2025 21:22

pull-request-size bot added the size/XL label Jul 14, 2025

copy-pr-bot bot temporarily deployed to GITLAB July 14, 2025 21:22 Inactive

github-actions bot added the docs label Jul 14, 2025

copy-pr-bot bot temporarily deployed to GITLAB July 14, 2025 21:23 Inactive

coderabbitai bot reviewed Jul 14, 2025

docs/guides/dynamo_deploy/dynamo_cloud.md (outdated, resolved)

Fix deploy/readme

5601212

copy-pr-bot bot temporarily deployed to GITLAB July 14, 2025 21:29 Inactive

hhzhang16 reviewed Jul 14, 2025

deploy/README.md (outdated, resolved)

docs/examples/llm_deployment.md (outdated, resolved)

Addressed Hannah's comments

b61e077

copy-pr-bot bot temporarily deployed to GITLAB July 14, 2025 23:52 Inactive

copy-pr-bot bot temporarily deployed to GITLAB July 14, 2025 23:53 Inactive

hhzhang16 reviewed Jul 15, 2025

docs/examples/llm_deployment.md (resolved)

Update VLLM_V1 references

ef579e8

atchernych requested review from whoisj and a team as code owners July 15, 2025 17:31

copy-pr-bot bot temporarily deployed to GITLAB July 15, 2025 17:31 Inactive

copy-pr-bot bot temporarily deployed to GITLAB July 15, 2025 17:36 Inactive

Merge remote-tracking branch 'origin/main'

d00d85d

copy-pr-bot bot temporarily deployed to GITLAB July 15, 2025 17:44 Inactive

copy-pr-bot bot temporarily deployed to GITLAB July 15, 2025 17:47 Inactive

merge main

919ac39

atchernych force-pushed the post-merge-doc-cleanup branch from 7cdc556 to 919ac39 Compare July 18, 2025 16:09

copy-pr-bot bot temporarily deployed to GITLAB July 18, 2025 16:09 Inactive

copy-pr-bot bot temporarily deployed to GITLAB July 18, 2025 16:10 Inactive

Merge branch 'main' into post-merge-doc-cleanup

9f21a50

copy-pr-bot bot temporarily deployed to GITLAB July 18, 2025 21:02 Inactive

copy-pr-bot bot temporarily deployed to GITLAB July 18, 2025 21:07 Inactive

Merge branch 'main' into post-merge-doc-cleanup

2d87ef9

copy-pr-bot bot temporarily deployed to GITLAB July 18, 2025 23:33 Inactive

Update deploy.sh

d1cd4bf

Signed-off-by: atchernych <atchernych@nvidia.com>

copy-pr-bot bot temporarily deployed to GITLAB July 19, 2025 01:30 Inactive

Merge branch 'main' into post-merge-doc-cleanup

d835ee4

copy-pr-bot bot temporarily deployed to GITLAB July 19, 2025 04:44 Inactive

Merge branch 'main' into post-merge-doc-cleanup

27d21b4

copy-pr-bot bot temporarily deployed to GITLAB July 20, 2025 22:24 Inactive

julienmancuso approved these changes Jul 21, 2025

atchernych merged commit 95dd942 into main Jul 21, 2025
12 of 13 checks passed

atchernych deleted the post-merge-doc-cleanup branch July 21, 2025 19:09

This was referenced Jul 25, 2025

docs: hello world deploy example #2102

Merged

docs: Clean index.rst #2104

Merged

docs: Update the operator docs #2172

Merged

docs: Bug 5424387 #2196

Merged

coderabbitai bot mentioned this pull request Aug 12, 2025

fix: Update quickstart.md #2414

Closed

This was referenced Aug 25, 2025

docs: Simplify sphinx build and table of contents on webpage #2519

Merged

docs: Fix dynamo cloud quickstart links #2765

Merged
