-
Notifications
You must be signed in to change notification settings - Fork 585
feat(dynamo-llm): Remove bring-your-own-engine #1216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@tanmayv25 for visibility |
528f7de
to
006b9c9
Compare
006b9c9
to
197ed55
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR removes the legacy "bring-your-own-engine" Python support and associated dynamo-run
flags, aligning the codebase with the standalone Python engine guide introduced in v0.2.1.
- Deletes the
dynamo-engine-python
crate and its Rust bindings - Removes
PythonStr
variants and thepython
feature fromdynamo-run
- Updates the CLI usage string to drop the old
dyn://<path>
notation
Reviewed Changes
Copilot reviewed 10 out of 14 changed files in this pull request and generated no comments.
Show a summary per file
File | Description |
---|---|
lib/engines/python/src/lib.rs | Removed entire legacy Python engine implementation |
lib/engines/python/Cargo.toml | Deleted Python engine crate manifest |
lib/bindings/python/rust/engine.rs | Moved and duplicated Python engine code into bindings |
lib/bindings/python/Cargo.toml | Removed path dependency on Python engine crate |
launch/dynamo-run/src/opt.rs | Removed PythonStr enum variant and parsing logic |
launch/dynamo-run/src/main.rs | Updated USAGE constant to drop dyn://<path> |
launch/dynamo-run/src/lib.rs | Deleted PYTHON_STR_SCHEME and related match arm |
launch/dynamo-run/Cargo.toml | Removed optional python feature |
Files not reviewed (4)
- Earthfile: Language not supported
- container/Dockerfile.none: Language not supported
- container/Dockerfile.tensorrt_llm: Language not supported
- container/Dockerfile.vllm: Language not supported
Comments suppressed due to low confidence (2)
launch/dynamo-run/src/main.rs:33
- The
USAGE
string no longer mentionsdyn://<path>
but still showsdyn
. To avoid confusion, restore thedyn://<path>
notation or clarify thatdyn
impliesdyn://<path>
.
const USAGE: &str = "USAGE: dynamo-run in=[http|text|dyn://<path>|batch:<folder>] out=ENGINE_LIST|dyn [--...];
lib/bindings/python/rust/engine.rs:112
- [nitpick] The
PythonServerStreamingEngine
implementation is duplicated here and was removed from thedynamo-engine-python
crate. Consider extracting this shared logic into a common module to avoid code duplication and ease maintenance.
pub struct PythonServerStreamingEngine {
It was removed from the docs in 0.2.1 and replaced with writing a [standalone Python engine](https://github.com/ai-dynamo/dynamo/blob/main/docs/guides/dynamo_run.md#writing-your-own-engine-in-python). Also remove the associated `dynamo-run` feature `python`. Releasing this in 0.3.0 will resolve #784 and #1109.
197ed55
to
57ae256
Compare
WalkthroughThis change removes support for the "python" engine and all related code, dependencies, and build features from the project. It eliminates the Python engine integration, corresponding Cargo features, and Docker build options, while also updating documentation and usage strings to reflect the removal. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant dynamo-run
participant Rust Engines
User->>dynamo-run: Run inference command
dynamo-run->>Rust Engines: Select engine (Python option removed)
Rust Engines-->>dynamo-run: Engine runs (Python engine unavailable)
dynamo-run-->>User: Return inference results
Assessment against linked issues
Poem
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (3)
container/Dockerfile.vllm (1)
371-377
: Double-check the scoped-feature syntax for Cargo (crate/feature)
--features dynamo-llm/block-manager
is valid only on Cargo ≥ 1.64 and when a workspace build is requested (which you do).
However, because you subsequently restrict the build to a subset of packages with-p …
, Cargo occasionally ignores per-crate feature flags (see rust-lang/cargo#9779). If you notice theblock-manager
code path not being compiled indynamo-llm
, split the command:cargo build --release --locked -p dynamo-llm --features block-manager cargo build --release --locked \ -p dynamo-run -p llmctl -p file://$PWD/components/http -p metrics
This guarantees the feature ends up on the right crate without relying on Cargo’s global resolution.
lib/bindings/python/rust/engine.rs (2)
151-168
: Potential GIL contention: always spawning a blocking thread is expensiveEvery call to
generate()
spawns a new blocking task just to acquire the GIL. Under high QPS this can exhaust the blocking‐thread pool and starve other tasks.Consider:
-let stream = tokio::task::spawn_blocking(move || { +let stream = tokio::task::spawn_blocking(move || { Python::with_gil(|py| { … }) }) .await??;• Re-using
pyo3_asyncio::tokio::into_stream
without an extraspawn_blocking
when the current thread already holds the GIL.
• Or create a dedicatedtokio::runtime::Handle
for Python work with a bounded thread-count.This will reduce context-switch overhead and latency.
282-288
: Per-itemspawn_blocking
can become the bottleneck
process_item
converts every message viatokio::task::spawn_blocking
, potentially queuing thousands of short-lived tasks. A lower-overhead alternative is to run the entire depythonize loop inside the existing blocking task that already holds the GIL, pushing deserialised objects through the channel.Example skeleton:
let (tx, rx) = mpsc::channel(128); tokio::task::spawn_blocking(move || Python::with_gil(|py| { for item in stream { // iterate in Python land match depythonize::<Resp>(&item)? { … } if tx.blocking_send(annotated).is_err() { break; } } }));Reduces two context switches per message and preserves back-pressure semantics.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
Cargo.lock
is excluded by!**/*.lock
lib/bindings/python/Cargo.lock
is excluded by!**/*.lock
📒 Files selected for processing (12)
Earthfile
(1 hunks)container/Dockerfile.none
(1 hunks)container/Dockerfile.tensorrt_llm
(1 hunks)container/Dockerfile.vllm
(1 hunks)launch/dynamo-run/Cargo.toml
(0 hunks)launch/dynamo-run/src/lib.rs
(0 hunks)launch/dynamo-run/src/main.rs
(1 hunks)launch/dynamo-run/src/opt.rs
(0 hunks)lib/bindings/python/Cargo.toml
(1 hunks)lib/bindings/python/rust/engine.rs
(2 hunks)lib/engines/python/Cargo.toml
(0 hunks)lib/engines/python/src/lib.rs
(0 hunks)
💤 Files with no reviewable changes (5)
- launch/dynamo-run/src/lib.rs
- lib/engines/python/Cargo.toml
- launch/dynamo-run/src/opt.rs
- launch/dynamo-run/Cargo.toml
- lib/engines/python/src/lib.rs
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: pre-merge-rust (lib/runtime/examples)
- GitHub Check: pre-merge-rust (lib/bindings/python)
- GitHub Check: Build and Test - vllm
- GitHub Check: pre-merge-rust (.)
🔇 Additional comments (7)
launch/dynamo-run/src/main.rs (2)
29-29
: LGTM: Example model updated.The model name change in the example usage is appropriate and keeps the documentation current.
33-33
: LGTM: Usage string updated to reflect Python engine removal.The removal of
"dyn://<path>"
syntax from the usage string is consistent with removing the bring-your-own-engine Python support.lib/bindings/python/Cargo.toml (1)
44-47
: LGTM: Dependencies updated for new Rust-based Python engine.The addition of
anyhow
,async-stream
,async-trait
, andasync-openai
dependencies aligns with implementing a new Rust-native async engine to replace the external Python engine support.Earthfile (1)
115-115
: LGTM: Python feature removed from build configuration.The removal of the
python
feature from the cargo build command is consistent with eliminating bring-your-own-engine Python support while maintaining other necessary features.container/Dockerfile.tensorrt_llm (1)
278-283
:✅ Verification successful
Verify that mistralrs feature removal doesn't break TensorRT-LLM functionality.
The cargo build command now runs without any features, which removes both
python
andmistralrs
features. While removingpython
aligns with the PR objectives, please confirm that themistralrs
feature is not required for TensorRT-LLM builds.
🏁 Script executed:
#!/bin/bash # Description: Check if mistralrs feature is required for TensorRT-LLM builds # Expected: Find references to mistralrs in TensorRT-LLM related code or configs # Search for mistralrs references in TensorRT-LLM context rg -i "mistralrs|mistral.*rs" --type dockerfile rg -i "mistralrs|mistral.*rs" container/ rg -i "tensorrt.*mistral|mistral.*tensorrt" # Check if mistralrs is conditionally compiled for tensorrt ast-grep --pattern 'cfg(feature = "mistralrs")'Length of output: 757
🏁 Script executed:
#!/bin/bash # Inspect Cargo.toml for mistralrs feature definitions and default features # 1. Find all occurrences of 'mistralrs' in TOML files rg -n "mistralrs" --glob "*.toml" # 2. Locate and display [features] sections in crates 'dynamo-run' and 'llmctl' rg -l 'name = "dynamo-run"' -R "*.toml" | xargs -I{} sh -c 'echo "== {} =="; sed -n "/^\[features\]/,/^\[/{p}" {}' rg -l 'name = "llmctl"' -R "*.toml" | xargs -I{} sh -c 'echo "== {} =="; sed -n "/^\[features\]/,/^\[/{p}" {}' # 3. Check for any use of default-features toggles rg -n "default-features" --glob "*.toml"Length of output: 2931
Mistralrs remains enabled via default features
The
cargo build
incontainer/Dockerfile.tensorrt_llm
no longer passes an explicit--features
flag, but it still picks up the crate’s default features. Inlaunch/dynamo-run/Cargo.toml
, the default features list is set to["mistralrs"]
, so themistralrs
engine is still compiled in without any additional flags. No further changes are needed here.container/Dockerfile.none (1)
49-55
:❌ Incorrect review comment
Consider passing feature flags or
--workspace
to avoid an accidental partial build
cargo build --release --locked
now relies on the default feature-set that each crate enables.
If any binaries (dynamo-run
,http
,llmctl
,metrics
,mock_worker
) still expect theblock-manager
or other opt-in features, the build will silently omit them and the resulting binaries may be missing symbols at runtime.-RUN cargo build --release --locked && \ +RUN cargo build --release --locked --workspace --features dynamo-llm/block-manager && \Please verify with a clean build that:
- All five binaries still compile successfully.
- Integration tests exercising removed
python
functionality are not compiled in.
🏁 Script executed:
#!/bin/bash # Display workspace section from root Cargo.toml echo "=== Workspace Section ===" awk 'BEGIN{in_ws=0} /^\[workspace\]/{in_ws=1} in_ws{print} /^\[/{if(in_ws && !/\[workspace\]/) exit}' Cargo.toml echo # Show feature definitions in root Cargo.toml echo "=== Root Features ===" grep -R "^features" -n Cargo.toml || echo "No features block in root Cargo.toml" echo # Find any 'block-manager' references (feature or code) echo "=== block-manager Occurrences ===" rg -n "block-manager" .Length of output: 2598
🏁 Script executed:
#!/bin/bash # Display first 80 lines of container/Dockerfile.none with line numbers sed -n '1,80p' -n container/Dockerfile.none | sed = | sed 'N; s/\n/ /'Length of output: 2123
Ignore feature flags for the “none” variant Dockerfile
The
container/Dockerfile.none
is explicitly the “no-features” build (as indicated by its.none
suffix), meant to produce a minimal set of binaries without any optional components. Adding--workspace
or--features dynamo-llm/block-manager
here would defeat the purpose of this variant. The other Dockerfiles (.sglang
,.vllm
) already include the appropriate feature flags in their own build stages.Likely an incorrect or invalid review comment.
lib/bindings/python/rust/engine.rs (1)
204-257
:⚠️ Potential issue
println!
leaks to stdout and blocks – use structured logging insteadInside the hot loop you mix
tracing::trace!
and a bareprintln!
, which:
- Bypasses the configured tracing subscriber.
- Performs a blocking write to stdout for every Python exception.
Replace with
tracing::error!
to integrate with the existing log pipeline.- println!(); - Python::with_gil(|py| e.display(py)); + tracing::error!("python exception: {}", e); + Python::with_gil(|py| e.display(py));Likely an incorrect or invalid review comment.
It was removed from the docs in 0.2.1 and replaced with writing a [standalone Python engine](https://github.com/ai-dynamo/dynamo/blob/main/docs/guides/dynamo_run.md#writing-your-own-engine-in-python). Also remove the associated `dynamo-run` feature `python`. Releasing this in 0.3.0 will resolve #784 and #1109.
It was removed from the docs in 0.2.1 and replaced with writing a standalone Python engine.
Also remove the associated
dynamo-run
featurepython
.Releasing this in 0.3.0 will resolve #784 and #1109.
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Chores