@biswapanda biswapanda commented Jun 15, 2025

Overview:

Added deployment and usage instructions for Inference Gateway with Dynamo on Kubernetes.

1.demo-Inference-Gateway-POL.mp4

Where should the reviewer start?

The README (deploy/inference-gateway/example/README.md) explains the steps to deploy Dynamo with the Inference Gateway.

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

Summary by CodeRabbit

  • Documentation

    • Added a comprehensive README with step-by-step instructions for deploying and configuring the Inference Gateway with Dynamo on Kubernetes, including setup, verification, and usage examples.
  • New Features

    • Introduced example Kubernetes manifests for deploying the Inference Gateway, inference models, pools, routing, and required RBAC resources.
    • Provided sample service and deployment configurations for running and exposing inference endpoints within a cluster.

@biswapanda biswapanda changed the title from "doc: add docs for inference gateway deployment" to "docs: add docs for inference gateway deployment" Jun 15, 2025
@github-actions github-actions bot added the docs label Jun 15, 2025

coderabbitai bot commented Jun 15, 2025


📥 Commits

Reviewing files that changed from the base of the PR and between 5536537 and 53a75f2.

📒 Files selected for processing (1)
  • deploy/inference-gateway/example/README.md (1 hunks)

Walkthrough

New documentation and Kubernetes resource manifests have been added to the deploy/inference-gateway/example directory. These files provide step-by-step instructions and YAML configurations for deploying an Inference Gateway integrated with Dynamo on a Kubernetes cluster, including role-based access, service exposure, model registration, and routing setup.

Changes

| File(s) | Change Summary |
| --- | --- |
| deploy/inference-gateway/example/README.md | Added comprehensive deployment and usage instructions for Inference Gateway with Dynamo on Kubernetes. |
| deploy/inference-gateway/example/resources/cluster-role.yaml, .../cluster-role-binding.yaml | Added ClusterRole and ClusterRoleBinding manifests for RBAC permissions. |
| deploy/inference-gateway/example/resources/dynamo-epp.yaml | Added Deployment manifest for the Dynamo EPP service with configuration and health checks. |
| deploy/inference-gateway/example/resources/inference-model.yaml | Added InferenceModel CRD manifest linking a model to an inference pool. |
| deploy/inference-gateway/example/resources/inference-pool.yaml | Added InferencePool CRD manifest specifying pool configuration and extension reference. |
| deploy/inference-gateway/example/resources/service.yaml | Added Service manifest exposing the Dynamo EPP deployment on port 9002. |
| deploy/inference-gateway/example/resources/http-router.yaml | Added HTTPRoute manifest to route requests to the inference pool backend. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Gateway
    participant HTTPRoute
    participant InferencePool
    participant Service (EPP)
    participant Pod (EPP)

    User->>Gateway: HTTP/2 Inference Request (port 9002)
    Gateway->>HTTPRoute: Match request path
    HTTPRoute->>InferencePool: Route to backend (dynamo-deepseek)
    InferencePool->>Service (EPP): Forward request
    Service (EPP)->>Pod (EPP): Proxy to EPP container
    Pod (EPP)->>Service (EPP): Model inference response
    Service (EPP)->>InferencePool: Return response
    InferencePool->>Gateway: Send response
    Gateway->>User: Return inference result

Poem

In the warren of clusters, a gateway appears,
With YAML and roles, it conquers our fears.
Models and pools, all ready to run,
Routing requests till the inference is done.
Rabbits rejoice—deployments are neat,
With every new manifest, success tastes sweet!
🐇✨


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (8)
deploy/inference-gateway/example/inference-gateway-resources.yaml (2)

58-66: Avoid using the default ServiceAccount for RBAC.
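Binding the default ServiceAccount cluster-wide is overly permissive: every pod in the namespace that does not set its own serviceAccountName inherits these permissions.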

Define a dedicated ServiceAccount and update the ClusterRoleBinding & Deployment:

+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: dynamo-deepseek-epp
+  namespace: default
---
 kind: ClusterRoleBinding
 subjects:
-  - kind: ServiceAccount
-    name: default
-    namespace: default
+  - kind: ServiceAccount
+    name: dynamo-deepseek-epp
+    namespace: default
 roleRef: ...
...
 spec:
+  serviceAccountName: dynamo-deepseek-epp
   containers:

31-38: Use more specific RBAC resource names.
pod-read is too generic and may collide with other roles.

Consider renaming to dynamo-deepseek-pod-read (and updating references accordingly).
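
As a rough sketch of the rename, the ClusterRole and its binding might look like the following. The rule shown (read access to pods) and the binding name `dynamo-deepseek-pod-read-binding` are assumptions for illustration, not the PR's actual contents; the subject reuses the dedicated ServiceAccount proposed in the previous comment.

```yaml
# Sketch only: the rule contents and the binding name are assumptions.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: dynamo-deepseek-pod-read
rules:
  - apiGroups: [""]              # core API group, where pods live
    resources: ["pods"]
    verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dynamo-deepseek-pod-read-binding   # hypothetical name
subjects:
  - kind: ServiceAccount
    name: dynamo-deepseek-epp
    namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: dynamo-deepseek-pod-read           # must match the ClusterRole name above
```

Note that `roleRef` is immutable on an existing binding, so a rename of the role means deleting and recreating the ClusterRoleBinding rather than patching it.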

deploy/inference-gateway/example/README.md (6)

7-8: Add article for consistency in prerequisites list.

- - Kubernetes cluster with kubectl configured
+ - A Kubernetes cluster with kubectl configured
🧰 Tools
🪛 LanguageTool

[grammar] ~7-~7: The singular proper name ‘Kubernetes’ must be used with a third-person or a past tense verb.
Context: ...quests. ## Prerequisites - Kubernetes cluster with kubectl configured - NVIDIA GPU dr...

(HE_VERB_AGR)


42-43: Insert missing comma.

- In this example we'll install `kgateway`.
+ In this example, we'll install `kgateway`.
🧰 Tools
🪛 LanguageTool

[typographical] ~42-~42: It appears that a comma is missing.
Context: ...y an inference gateway service. In this example we'll install kgateway. Install the ...

(DURING_THAT_TIME_COMMA)


50-51: Fix punctuation and unify casing for kgateway.

- Deploy an Inference Gateway.  In this example we'll install `Kgateway`
+ Deploy an Inference Gateway.  In this example, we'll install `kgateway`
🧰 Tools
🪛 LanguageTool

[typographical] ~50-~50: It appears that a comma is missing.
Context: ... Deploy an Inference Gateway. In this example we'll install Kgateway ```bash KGTW_V...

(DURING_THAT_TIME_COMMA)


75-75: Hyphenate compound modifier.

- 4. **Apply Dynamo specific manifests**
+ 4. **Apply Dynamo-specific manifests**
🧰 Tools
🪛 LanguageTool

[uncategorized] ~75-~75: When ‘Dynamo-specific’ is used as a modifier, it is usually spelled with a hyphen.
Context: ... True 1m ``` 4. Apply Dynamo specific manifests The Inference Gateway is c...

(SPECIFIC_HYPHEN)


118-124: Normalize heading levels and correct typo.
Make "4. Usage" a proper Markdown heading and fix the typo "Poulate" → "Populate".

- 4. Usage
+ ## Usage
@@
- ### 1: Poulate gateway url for your k8s cluster
+ ### 1. Populate gateway URL for your k8s cluster
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

124-124: Heading levels should only increment by one level at a time
Expected: h2; Actual: h3

(MD001, heading-increment)


146-146: Specify language for fenced code block.

- ```
+ ```bash
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

146-146: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 13a99b7 and 032f270.

📒 Files selected for processing (2)
  • deploy/inference-gateway/example/README.md (1 hunks)
  • deploy/inference-gateway/example/inference-gateway-resources.yaml (1 hunks)
🧰 Additional context used
🪛 LanguageTool
deploy/inference-gateway/example/README.md

[uncategorized] ~129-~129: Possible missing comma found.
Context: ...ateway-URL> To test the gateway in minikube use the following:bash minikube tun...

(AI_HYDRA_LEO_MISSING_COMMA)

🪛 Checkov (3.2.334)
deploy/inference-gateway/example/inference-gateway-resources.yaml

[MEDIUM] 119-171: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[MEDIUM] 119-171: Minimize the admission of root containers

(CKV_K8S_23)

🪛 GitHub Actions: Pre Merge Validation of (ai-dynamo/dynamo/refs/pull/1533/merge) by biswapanda.
deploy/inference-gateway/example/inference-gateway-resources.yaml

[error] 16-30: YAML check failed: expected a single document in the stream but found another document starting at line 30, column 1.
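
This is the error a single-document YAML parser reports when it reaches a second `---` separator. The later revision of this PR splits the manifests into one file per resource under `resources/`, which sidesteps the problem. An alternative, assuming the failing check is the `check-yaml` hook from pre-commit-hooks (the log above does not confirm this), would be to allow multi-document files. The `rev` value and the `files` pattern below are placeholders:

```yaml
# .pre-commit-config.yaml (sketch; rev and files are placeholders)
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: check-yaml
        # Accept files that contain several documents separated by '---'
        args: [--allow-multiple-documents]
        files: ^deploy/inference-gateway/
```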

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build and Test - vllm

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (13)
deploy/inference-gateway/example/resources/cluster-role.yaml (2)

19-27: Consider consolidating duplicate rules for the same apiGroup.
You can merge the two inference.networking.x-k8s.io entries into one to reduce redundancy:

- - apiGroups: ["inference.networking.x-k8s.io"]
-   resources: ["inferencepools"]
-   verbs: ["get", "watch", "list"]
- - apiGroups: ["inference.networking.x-k8s.io"]
-   resources: ["inferencemodels"]
-   verbs: ["get", "watch", "list"]
+ - apiGroups: ["inference.networking.x-k8s.io"]
+   resources: ["inferencepools", "inferencemodels"]
+   verbs: ["get", "watch", "list"]

40-40: Add a trailing newline.
A newline at the end of the file will satisfy YAML lint rules.

🧰 Tools
🪛 YAMLlint (1.37.1)

[error] 40-40: no new line character at the end of file

(new-line-at-end-of-file)
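
The same finding recurs in five of the new manifests, so it may be simpler to fix it mechanically than by hand. A minimal sketch, assuming the repository runs pre-commit with the standard pre-commit-hooks repo (the `rev` value and the `files` pattern are placeholders):

```yaml
# .pre-commit-config.yaml (sketch; rev and files are placeholders)
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      # Ensures every matched file ends with exactly one newline
      - id: end-of-file-fixer
        files: ^deploy/inference-gateway/
```

Running `pre-commit run --all-files` once would then rewrite the affected files in place.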

deploy/inference-gateway/example/resources/service.yaml (1)

28-28: Add a trailing newline.
Append a newline to satisfy YAML lint.

🧰 Tools
🪛 YAMLlint (1.37.1)

[error] 28-28: no new line character at the end of file

(new-line-at-end-of-file)

deploy/inference-gateway/example/resources/http-router.yaml (1)

34-34: Add a trailing newline.
Include a final newline to pass YAML linting.

🧰 Tools
🪛 YAMLlint (1.37.1)

[error] 34-34: no new line character at the end of file

(new-line-at-end-of-file)

deploy/inference-gateway/example/resources/inference-model.yaml (1)

26-26: Add a trailing newline.
A final newline satisfies YAML lint requirements.

🧰 Tools
🪛 YAMLlint (1.37.1)

[error] 26-26: no new line character at the end of file

(new-line-at-end-of-file)

deploy/inference-gateway/example/resources/inference-pool.yaml (1)

28-28: Add a trailing newline.
Include a newline at EOF to satisfy the linter.

🧰 Tools
🪛 YAMLlint (1.37.1)

[error] 28-28: no new line character at the end of file

(new-line-at-end-of-file)

deploy/inference-gateway/example/resources/dynamo-epp.yaml (1)

34-50: Enhance container security
To satisfy best practices and address Checkov warnings, add a minimal securityContext to drop privileges and enforce non-root execution:

 containers:
 - name: epp
+  securityContext:
+    runAsNonRoot: true
+    allowPrivilegeEscalation: false
   image: us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/epp:main
   imagePullPolicy: Always
   args:
deploy/inference-gateway/example/README.md (6)

7-7: Clarify prerequisite wording
Change to:

- A Kubernetes cluster with kubectl installed and configured

for consistency and to specify both installation and configuration steps.

🧰 Tools
🪛 LanguageTool

[grammar] ~7-~7: The singular proper name ‘Kubernetes’ must be used with a third-person or a past tense verb.
Context: ...quests. ## Prerequisites - Kubernetes cluster with kubectl configured - NVIDIA GPU dr...

(HE_VERB_AGR)


42-42: Add missing comma & unify casing
Should read:

First, deploy an inference gateway service. In this example, we'll install `kgateway`.

(use lowercase “kgateway” consistently).

🧰 Tools
🪛 LanguageTool

[typographical] ~42-~42: It appears that a comma is missing.
Context: ...y an inference gateway service. In this example we'll install kgateway. Install the ...

(DURING_THAT_TIME_COMMA)


50-50: Add missing comma & unify casing
Should read:

Deploy an Inference Gateway. In this example, we'll install `kgateway`.

[matches prior casing and comma style]

🧰 Tools
🪛 LanguageTool

[typographical] ~50-~50: It appears that a comma is missing.
Context: ... Deploy an Inference Gateway. In this example we'll install Kgateway ```bash KGTW_V...

(DURING_THAT_TIME_COMMA)


75-75: Hyphenate compound modifier
Update the heading to:

4. **Apply Dynamo-specific manifests**

to correctly hyphenate “Dynamo-specific.”

🧰 Tools
🪛 LanguageTool

[uncategorized] ~75-~75: When ‘Dynamo-specific’ is used as a modifier, it is usually spelled with a hyphen.
Context: ... True 1m ``` 4. Apply Dynamo specific manifests The Inference Gateway is c...

(SPECIFIC_HYPHEN)


129-129: Add missing comma after introductory phrase
Should read:

To test the gateway in minikube, use the following:

for grammatical correctness.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~129-~129: Possible missing comma found.
Context: ...ateway-URL> To test the gateway in minikube use the following:bash minikube tun...

(AI_HYDRA_LEO_MISSING_COMMA)


146-146: Specify code block language
Add a language identifier to the fenced block (e.g., ```bash) to satisfy Markdown linting:

- ```
+ ```bash
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

146-146: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 032f270 and a69b4b4.

📒 Files selected for processing (8)
  • deploy/inference-gateway/example/README.md (1 hunks)
  • deploy/inference-gateway/example/resources/cluster-role-binding.yaml (1 hunks)
  • deploy/inference-gateway/example/resources/cluster-role.yaml (1 hunks)
  • deploy/inference-gateway/example/resources/dynamo-epp.yaml (1 hunks)
  • deploy/inference-gateway/example/resources/http-router.yaml (1 hunks)
  • deploy/inference-gateway/example/resources/inference-model.yaml (1 hunks)
  • deploy/inference-gateway/example/resources/inference-pool.yaml (1 hunks)
  • deploy/inference-gateway/example/resources/service.yaml (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • deploy/inference-gateway/example/resources/cluster-role-binding.yaml
🧰 Additional context used
🪛 Checkov (3.2.334)
deploy/inference-gateway/example/resources/dynamo-epp.yaml

[MEDIUM] 15-67: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[MEDIUM] 15-67: Minimize the admission of root containers

(CKV_K8S_23)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build and Test - vllm
🔇 Additional comments (8)
deploy/inference-gateway/example/resources/cluster-role.yaml (2)

15-18: ClusterRole metadata is correctly defined.
The apiVersion, kind, and metadata.name (pod-read) align with expected RBAC conventions.


29-34: Verify authentication/authorization review permissions.
Ensure the tokenreviews and subjectaccessreviews create verbs match your gateway’s auth flow. If using SubjectAccessReview or TokenReview in the controller, confirm the ServiceAccount has these rights.
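
For reference, rules that grant these permissions usually take the shape below. This is a sketch rather than the PR's manifest: the API groups are the standard Kubernetes ones for these two resources, and the role name `pod-read` is taken from the comment above.

```yaml
# Sketch: only the auth-related rules are shown.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-read
rules:
  - apiGroups: ["authentication.k8s.io"]
    resources: ["tokenreviews"]
    verbs: ["create"]
  - apiGroups: ["authorization.k8s.io"]
    resources: ["subjectaccessreviews"]
    verbs: ["create"]
```

Both resources support only the `create` verb, because a review is submitted and answered within a single request. If the controller never performs token or access reviews, these rules can be dropped.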

deploy/inference-gateway/example/resources/service.yaml (1)

21-23: Ensure Service selector matches the Deployment labels.
The Service selects pods with app: dynamo-deepseek-epp—confirm the dynamo-epp.yaml Deployment pod template uses the same label.
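
The relationship being checked can be sketched as follows. Only the label-related fields matter here; the resource names, the `default` namespace, and `targetPort` are assumptions, while the label value, port 9002, and the image come from this PR's review.

```yaml
# Sketch: the three 'app' labels below must all be identical.
apiVersion: v1
kind: Service
metadata:
  name: dynamo-deepseek-epp        # assumed name
  namespace: default
spec:
  selector:
    app: dynamo-deepseek-epp       # selects pods carrying this label
  ports:
    - protocol: TCP
      port: 9002
      targetPort: 9002             # assumption: same port inside the container
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dynamo-deepseek-epp        # assumed name
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dynamo-deepseek-epp     # must match the template labels
  template:
    metadata:
      labels:
        app: dynamo-deepseek-epp   # the label the Service selector looks for
    spec:
      containers:
        - name: epp
          image: us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/epp:main
```

After applying both, `kubectl get endpoints dynamo-deepseek-epp` should list at least one address; an empty list usually points to a label mismatch or to pods that are not ready.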

deploy/inference-gateway/example/resources/http-router.yaml (1)

20-24: Set an explicit namespace in parentRefs if the Gateway lives in a different namespace.
By default, this HTTPRoute targets a Gateway in its own namespace. If inference-gateway lives elsewhere, add namespace: <namespace> under parentRefs.
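
A sketch of what that would look like, assuming the standard Gateway API `v1` HTTPRoute schema. The route name and the `gateway-system` namespace are hypothetical; `inference-gateway` and the `dynamo-deepseek` pool are the names mentioned in this review.

```yaml
# Sketch: route name and Gateway namespace are hypothetical.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: dynamo-deepseek-route
  namespace: default
spec:
  parentRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: inference-gateway
      namespace: gateway-system    # only needed when the Gateway is not in 'default'
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - group: inference.networking.x-k8s.io
          kind: InferencePool
          name: dynamo-deepseek
```

For a cross-namespace attachment to be accepted, the Gateway's listener must also permit routes from the route's namespace through its `allowedRoutes` setting, which defaults to the Gateway's own namespace only.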

deploy/inference-gateway/example/resources/inference-model.yaml (1)

15-19: InferenceModel declaration is valid.
API version, kind, metadata name/namespace, and spec fields look correct for linking to the dynamo-deepseek pool.
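
For readers without the file open, an InferenceModel of this kind has roughly the following shape. This sketch assumes the `v1alpha2` API of the Gateway API Inference Extension, so field names may differ in other versions; the resource name and `modelName` are hypothetical, and only the pool name comes from this review.

```yaml
# Sketch under the v1alpha2 API; name and modelName are hypothetical.
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferenceModel
metadata:
  name: dynamo-deepseek-model
  namespace: default
spec:
  modelName: example-model   # the value clients send in the request's "model" field
  criticality: Standard
  poolRef:
    name: dynamo-deepseek    # the InferencePool that serves this model
```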

deploy/inference-gateway/example/resources/dynamo-epp.yaml (3)

15-22: Metadata consistency looks good
The Deployment’s name, namespace, and label selectors align correctly, ensuring proper pod targeting.


28-33: Grace period consistency verified
terminationGracePeriodSeconds: 130 appropriately mirrors the longest pod shutdown time in the inference pool.


51-67: Probes and ports configuration is correct
The gRPC-based liveness/readiness probes on port 9003 and the metrics port are well-defined with sensible delays.
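
As an illustration of the configuration described here, a container with native gRPC probes might be declared as below. This is a sketch and not the PR's Deployment: the pod name, the metrics port number, and the probe timings are assumptions, while port 9003, the 130-second grace period, and the image come from this review.

```yaml
# Sketch: pod name, metrics port, and probe timings are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: epp-probe-example
spec:
  terminationGracePeriodSeconds: 130
  containers:
    - name: epp
      image: us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/epp:main
      ports:
        - name: grpc
          containerPort: 9002
        - name: grpc-health
          containerPort: 9003
        - name: metrics
          containerPort: 9090      # assumed metrics port
      livenessProbe:
        grpc:
          port: 9003               # kubelet calls the gRPC health checking service here
        initialDelaySeconds: 5
        periodSeconds: 10
      readinessProbe:
        grpc:
          port: 9003
        initialDelaySeconds: 5
        periodSeconds: 10
```

Native `grpc` probes require the container to implement the standard gRPC health checking protocol and a cluster version that supports this probe type.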

@biswapanda biswapanda enabled auto-merge (squash) June 16, 2025 04:52
@mohammedabdulwahhab mohammedabdulwahhab left a comment

Overall, it reads well but needs a pass to fix the headings and subheadings.

Co-authored-by: mohammedabdulwahhab <furkhan324@berkeley.edu>
Signed-off-by: Biswa Panda <biswa.panda@gmail.com>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (5)
deploy/inference-gateway/example/README.md (5)

26-26: Remove trailing whitespace
Pre-commit hook flagged a trailing whitespace on this line. Please run pre-commit run --all-files or manually remove the extra space to resolve the pipeline failure.


75-75: Use hyphen in compound modifier
Change “Dynamo specific manifests” to “Dynamo-specific manifests” for correct hyphenation.


124-124: Fix spelling and casing in header
Correct “Poulate gateway url for your k8s cluster” to “Populate Gateway URL for your Kubernetes cluster”.


129-129: Add missing comma
Insert a comma after “minikube” for proper punctuation:

-To test the gateway in minikube use the following:
+To test the gateway in minikube, use the following:

146-146: Specify code block language
The fenced code block here has no language. Add “bash” for syntax highlighting:

-```
+```bash
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a69b4b4 and 5536537.

📒 Files selected for processing (1)
  • deploy/inference-gateway/example/README.md (1 hunks)
🧰 Additional context used
🪛 GitHub Actions: Pre Merge Validation of (ai-dynamo/dynamo/refs/pull/1533/merge) by biswapanda.
deploy/inference-gateway/example/README.md

[error] 26-26: Trailing whitespace detected and fixed by pre-commit hook 'trailing-whitespace'. Please run 'pre-commit run --all-files' locally to fix.

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Build and Test - vllm
  • GitHub Check: Mirror Repository to GitLab

@biswapanda biswapanda self-assigned this Jun 17, 2025
@biswapanda biswapanda changed the title from "docs: add docs for inference gateway deployment" to "docs: add docs for experimental inference gateway deployment" Jun 17, 2025