-
Notifications
You must be signed in to change notification settings - Fork 586
docs: Update Multimodal Example README #1275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Hi whoisj! Thank you for contributing to ai-dynamo/dynamo. Just a reminder: The 🚀 |
WalkthroughThe README for multimodal deployment examples was updated to clarify worker naming, specifying "decode_worker" instead of "vllm_worker". Communication details were enhanced, outlining the use of NATS and RDMA for embedding transfers. Flow diagrams and formatting were revised for consistency and accuracy, with no changes to code or APIs. Changes
Poem
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
examples/multimodal/README.md
(4 hunks)
🧰 Additional context used
🪛 LanguageTool
examples/multimodal/README.md
[grammar] ~37-~37: The verb ‘encode’ does not usually follow articles like ‘the’. Check that ‘encode’ is spelled correctly; using ‘encode’ as a noun may be non-standard.
Context: ..../llm/README.md) example. By separating the encode from the prefill and decode stages, we ...
(A_INFINITIVE)
[grammar] ~38-~38: The usual collocation for “independently” is “of”, not “from”. Did you mean “independently of”?
Context: ... deployment and scale the encode worker independently from the prefill and decode workers if neede...
(INDEPENDENTLY_FROM_OF)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Build and Test - vllm
🔇 Additional comments (10)
examples/multimodal/README.md (10)
27-27
: Corrected worker terminology for aggregated serving
The update replaces the ambiguous “vllm worker” label with explicit references todecode_worker
, improving clarity.
29-29
: Front-end bullet is clear and consistent
Thefrontend
component is properly described as the HTTP endpoint. No issues found.
33-36
: Updated deployment description for aggregated serving
The narrative clearly outlines the role ofencode_worker
anddecode_worker
, along with NATS + RDMA for data/embeddings transfer.
41-49
: Mermaid diagram replaces ASCII flow
Switching to a Mermaid flowchart improves readability on GitHub. The syntax and arrow directions correctly reflect the described pipeline.
64-84
: Formatted JSON request sample
The multi-line JSON payload is well-indented and uses proper quoting inside a single-quotedcurl -d
block.
96-98
: Corrected worker list for disaggregated serving
The bullet list accurately enumeratesencode_worker
,decode_worker
, andprefill_worker
, and thefrontend
is still clearly defined.
102-106
: Updated deployment narrative for disaggregated serving
The roles of the three workers and the data flow via NATS + RDMA are clearly articulated.
110-119
: Mermaid diagram for disaggregated flow
The new Mermaid block correctly visualizes the extended pipeline (includingprefill_worker
).
131-154
: Formatted JSON request sample (disaggregated)
Consistent with the aggregated example, the JSON payload is correctly formatted forcurl
.
158-158
: Language tag for JSON response block
Changing the response example to ajson
code fence enables syntax highlighting, which is a nice documentation enhancement.
17dd623
to
5eb7d7e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
examples/multimodal/README.md (1)
37-38
: Consider refining sentence for clarity
Instead of:
“By separating the encode from the prefill and decode stages, we can have a more flexible deployment and scale the encode worker independently from the prefill and decode workers if needed.”Consider:
- “By separating encoding from the prefill and decode stages, …”
- Use “independently of” instead of “independently from.”
🧰 Tools
🪛 LanguageTool
[grammar] ~37-~37: The verb ‘encode’ does not usually follow articles like ‘the’. Check that ‘encode’ is spelled correctly; using ‘encode’ as a noun may be non-standard.
Context: ..../llm/README.md) example. By separating the encode from the prefill and decode stages, we ...(A_INFINITIVE)
[grammar] ~38-~38: The usual collocation for “independently” is “of”, not “from”. Did you mean “independently of”?
Context: ... deployment and scale the encode worker independently from the prefill and decode workers if neede...(INDEPENDENTLY_FROM_OF)
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
examples/multimodal/README.md
(4 hunks)
🧰 Additional context used
🪛 LanguageTool
examples/multimodal/README.md
[grammar] ~37-~37: The verb ‘encode’ does not usually follow articles like ‘the’. Check that ‘encode’ is spelled correctly; using ‘encode’ as a noun may be non-standard.
Context: ..../llm/README.md) example. By separating the encode from the prefill and decode stages, we ...
(A_INFINITIVE)
[grammar] ~38-~38: The usual collocation for “independently” is “of”, not “from”. Did you mean “independently of”?
Context: ... deployment and scale the encode worker independently from the prefill and decode workers if neede...
(INDEPENDENTLY_FROM_OF)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Build and Test - vllm
🔇 Additional comments (7)
examples/multimodal/README.md (7)
27-28
: Terminology updated correctly
The references tovllm_worker
have been successfully replaced withdecode_worker
in both the workers and processor bullets.
33-36
: Aggregated deployment description is clear and accurate
The steps correctly describe howencode_worker
anddecode_worker
interact via NATS and RDMA, and the terminology aligns with the rest of the document.
41-49
: Mermaid diagram for aggregated serving looks correct
The flowchart accurately represents the interactions and is valid Mermaid syntax.
64-84
: Client request example formatting is consistent
Thecurl
command and embedded JSON payload are well-formatted and clear.
96-99
: Disaggregated components list is consistent
The bullets correctly enumerateencode_worker
,decode_worker
, andprefill_worker
, with updated terminology throughout.
111-120
: Mermaid diagram for disaggregated serving is accurate
The flowchart correctly showsencode_worker
→prefill_worker
→decode_worker
along with the other interactions.
158-160
: Response example uses proper JSON block
Labeling the response snippet withjson
enhances readability and clarity.
This change corrects the README.md file in the examples/multimodal folder: - Correct "vllm worker" to "decode worker" - Correct assertion that data is moved via NATS when embeddings are moved via RDMA. Additionally, this change updates the textual graphs with Mermaid graphs for improved presentation on github.com.
5eb7d7e
to
0c4a230
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
examples/multimodal/README.md
(3 hunks)
🧰 Additional context used
🪛 LanguageTool
examples/multimodal/README.md
[grammar] ~37-~37: The verb ‘encode’ does not usually follow articles like ‘the’. Check that ‘encode’ is spelled correctly; using ‘encode’ as a noun may be non-standard.
Context: ..../llm/README.md) example. By separating the encode from the prefill and decode stages, we ...
(A_INFINITIVE)
[grammar] ~38-~38: The usual collocation for “independently” is “of”, not “from”. Did you mean “independently of”?
Context: ... deployment and scale the encode worker independently from the prefill and decode workers if neede...
(INDEPENDENTLY_FROM_OF)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Build and Test - vllm
🔇 Additional comments (6)
examples/multimodal/README.md (6)
27-29
: Terminology updated todecode_worker
The component bullets now consistently replace “vllm_worker” with “decode_worker” in the aggregated section.
33-36
: Aggregated deployment description is clear
The narrative correctly outlines the roles ofencode_worker
anddecode_worker
, and specifies NATS/RDMA usage.
64-77
: Aggregated client JSON formatting is correct
Thecurl
request JSON is properly formatted within the bash snippet, using single quotes for shell and double quotes for JSON.Also applies to: 79-84
96-97
: Disaggregated terminology updated todecode_worker
The components list consistently usesdecode_worker
in place of the old generic term.
102-102
: Disaggregated deployment description is accurate
The text now correctly describes the encode→prefill→decode workflow and NATS/RDMA communication channels.Also applies to: 104-106
134-146
: Disaggregated client JSON formatting is correct
Thecurl
example JSON is well-formatted for shell use and consistent with the aggregated example.Also applies to: 149-154
This change corrects the README.md file in the examples/multimodal folder:
Additionally, this change updates the textual graphs with Mermaid graphs for improved presentation on github.com.
DIS-100
Summary by CodeRabbit