@nv-tusharma nv-tusharma commented Jul 3, 2025

Overview:

This PR introduces a slim sglang runtime image, currently about 12 GB in size. I will double-check the size before merging.

dynamo:latest-sglang-runtime                503fb20b007c   9 minutes ago       11.7GB

Details:

  • Install NIXL & sglang[runtime-common] in the runtime stage. We also have to avoid installing flashinfer-python (a transitive dependency of sglang[all]) because it requires nvcc to be present in the container. It is not a required dependency for how sglang is used in Dynamo.
  • Install and copy NIXL artifacts into the final runtime stage
  • Install build-essential and libnuma-dev as runtime dependencies for sglang and NIXL.

agg and disagg examples are passing

Where should the reviewer start?

  • container/Dockerfile.sglang

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

@nv-tusharma nv-tusharma changed the title from "[WIP]: Add support for sglang runtime image" to "build: Add support for sglang runtime image" Jul 11, 2025
@nv-tusharma nv-tusharma marked this pull request as ready for review July 11, 2025 00:13
@github-actions github-actions bot added the build label Jul 11, 2025
coderabbitai bot commented Jul 11, 2025

Walkthrough

The Dockerfile for the sglang container was refactored to change how Python wheels are built and installed, streamline development and runtime environments, update system and Python dependencies, adjust file copying and environment variables, and modify the container entrypoint. Minor comments and TODOs were added.

Changes

File(s) | Change Summary
container/Dockerfile.sglang | Refactored wheel build/install process, updated system and Python dependencies, changed file copy structure, set new environment variables, switched entrypoint, and added comments/TODOs.

Sequence Diagram(s)

sequenceDiagram
    participant Builder as Build Stage
    participant Runtime as Runtime Stage

    Builder->>Builder: Build Python wheels (nixl, ai-dynamo, ai-dynamo-runtime)
    Builder->>Builder: Install sglang in editable mode
    Builder->>Builder: Copy NATS, etcd, UCX, NIXL binaries/libs

    Builder->>Runtime: Copy built wheels and binaries/libs
    Runtime->>Runtime: Install system packages (build-essential, libnuma-dev, python3-dev)
    Runtime->>Runtime: Install Python wheels from wheelhouse
    Runtime->>Runtime: Install additional Python packages (sglang, einops, sgl-kernel, sentencepiece)
    Runtime->>Runtime: Set environment variables (plugin dirs, library paths, PYTHONPATH)
    Runtime->>Runtime: Set entrypoint to nvidia_entrypoint.sh

Poem

In a Docker warren, wheels now spin anew,
With binaries and libraries, the workspace grew.
Editable installs and dependencies align,
A nvidia entrypoint—how divine!
🐇✨ Wheels in the burrow, code running fast,
SGLang containers, streamlined at last!



@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (6)
container/Dockerfile.sglang (6)

130-134: Build wheel in wheel_builder once, then COPY – don't re-compile in base

The wheel is already needed later in the wheel_builder stage.
Re-building it here lengthens the build, duplicates Rust/Python compile time and keeps heavy build deps in base. Move the two commands below into the wheel_builder stage and simply COPY the wheel into base (and runtime).

-# Install NIXL Python module
-RUN cd /opt/nixl && uv build . --out-dir /workspace/wheels/nixl
-# Install the wheel
-# TODO: Move NIXL wheel install to the wheel_builder stage
-RUN uv pip install /workspace/wheels/nixl/*.whl
+# NIXL wheel will be produced in wheel_builder and copied in
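
A rough sketch of the layout this suggestion points at. It assumes the wheel_builder stage already has uv and the NIXL sources at /opt/nixl; the PR does not show that stage, so the stage contents and the wheelhouse path are assumptions rather than the actual Dockerfile.

```dockerfile
# --- in the wheel_builder stage (assumed to have uv and /opt/nixl available) ---
RUN cd /opt/nixl && uv build . --out-dir /workspace/wheels/nixl

# --- in base and runtime: reuse the prebuilt wheel instead of recompiling ---
COPY --from=wheel_builder /workspace/wheels/nixl/*.whl wheelhouse/
RUN uv pip install nixl --find-links wheelhouse
```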

401-405: Heavy build tool-chain left inside runtime image

build-essential and python3-dev add ~150 MB; they are only required to compile wheels.
Because wheels are pre-built in wheel_builder, drop them here to slim the 11 GB image, or install them with --no-install-recommends and remove immediately after uv pip install … to keep only the shared libs actually needed at runtime.
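
One way the "install, then remove" variant could look, as a sketch only. It assumes nothing needs a compiler once the wheels are installed; the PR description lists build-essential as a runtime dependency for sglang and NIXL, so that assumption would need to be verified first. libnuma-dev is deliberately kept.

```dockerfile
# Sketch: install the toolchain, install the wheels, then purge the toolchain
# in the same RUN so it never ends up in a committed layer.
# Assumes uv and the virtual environment were set up earlier in the stage.
RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        build-essential python3-dev libnuma-dev && \
    uv pip install ai-dynamo ai-dynamo-runtime nixl --find-links wheelhouse && \
    apt-get purge -y build-essential python3-dev && \
    apt-get autoremove -y && \
    rm -rf /var/lib/apt/lists/*
```

The purge only saves space because it runs in the same RUN instruction as the install; a separate RUN would leave the packages in the earlier layer.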


415-420: Unpinned packages hurt reproducibility and may overwrite transitive deps

uv pip install einops and sentencepiece pull the latest versions each build.
Freeze to known-good versions (ideally via requirements.runtime.txt) to avoid silent breakage and improve layer caching.
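
A minimal sketch of the requirements-file approach. requirements.runtime.txt is the name proposed above; its location in the repository and the exact pins inside it are assumptions left to the maintainers.

```dockerfile
# Sketch: install pinned runtime packages from a checked-in requirements file.
# The source path of the file is hypothetical.
COPY requirements.runtime.txt /tmp/requirements.runtime.txt
RUN uv pip install -r /tmp/requirements.runtime.txt
```

Copying the file on its own line means the install layer is rebuilt only when the pins change.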


430-434: Copying entire repo into runtime bloats layer & leaks build secrets

COPY . /workspace brings git history, CI config, tests, etc. into the runtime image.
Restrict the copy to what the service truly needs (e.g., examples/sglang, deploy/sdk/src) or remove .git, docs, and CI artefacts afterwards.
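
As a sketch, the narrower copy could look like this. The two directories are the ones named in the suggestion; whether they are sufficient for the service is an assumption.

```dockerfile
# Sketch: copy only what the runtime needs instead of the whole build context.
COPY examples/sglang /workspace/examples/sglang
COPY deploy/sdk/src /workspace/deploy/sdk/src
```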


410-413: Symlinking every venv binary can shadow system tools

ln -sf $VIRTUAL_ENV/bin/* /usr/local/bin/ may replace python, pip, or libc-provided binaries, making debugging harder. Link only the CLI entry-points you need (dynamo-run, llmctl, …).


384-389: Cache NATS/etcd layers or use distro packages

Downloading tarballs every build burns cache. Consider:

  1. Installing nats-server & etcd via apt (already available in Ubuntu 24.04), or
  2. Adding --mount=type=cache,target=/var/cache/apt to preserve the downloaded .deb/.tar.gz.
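
A sketch of option 2 combined with option 1, following the cache-mount pattern from the Docker build documentation. It requires BuildKit. The package names nats-server and etcd-server are assumptions and should be checked against the distro archive.

```dockerfile
# Ubuntu images delete downloaded .deb files by default, so that cleanup hook
# is disabled first; otherwise the cache mount stays empty.
RUN rm -f /etc/apt/apt.conf.d/docker-clean && \
    echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache

# Package names below are assumptions, not taken from the PR.
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        nats-server etcd-server
```
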
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1704b12 and 279939f.

📒 Files selected for processing (1)
  • container/Dockerfile.sglang (3 hunks)
🧰 Additional context used
🧠 Learnings (1)
container/Dockerfile.sglang (1)
Learnt from: fsaady
PR: ai-dynamo/dynamo#1730
File: examples/sglang/slurm_jobs/scripts/worker_setup.py:230-244
Timestamp: 2025-07-03T10:14:30.570Z
Learning: In examples/sglang/slurm_jobs/scripts/worker_setup.py, background processes (like nats-server, etcd) are intentionally left running even if later processes fail. This design choice allows users to manually connect to nodes and debug issues without having to restart the entire SLURM job from scratch, providing operational flexibility for troubleshooting in cluster environments.
🪛 GitHub Actions: Pre Merge Validation of (ai-dynamo/dynamo/refs/pull/1770/merge) by nv-tusharma.
container/Dockerfile.sglang

[error] 412-412: Pre-commit hook 'trailing-whitespace' failed. Trailing whitespace was fixed in this file. Run 'pre-commit run --all-files' locally to reproduce.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build and Test - vllm
🔇 Additional comments (1)
container/Dockerfile.sglang (1)

1-436: Trailing-whitespace pre-commit hook is failing

The CI log shows pre-commit hook 'trailing-whitespace' failed.
Run pre-commit run --all-files locally or docker run --rm -v $PWD:/code pre-commit/pre-commit:3.7.0 to strip the whitespace before pushing.

@ishandhanani ishandhanani merged commit 6cdda03 into main Jul 14, 2025
9 of 10 checks passed
@ishandhanani ishandhanani deleted the tusharma/slim-sglang-runtime-build branch July 14, 2025 00:19
ln -sf $VIRTUAL_ENV/bin/* /usr/local/bin/ && \
rm -r wheelhouse
COPY --from=base /workspace/wheels/nixl/*.whl wheelhouse/
RUN uv pip install ai-dynamo --find-links wheelhouse && \

Contributor

You can do this in a single command:

uv pip install ai-dynamo nixl --find-links wheelhouse
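
Applied to the RUN instruction under review, the suggestion might look like the sketch below. Including ai-dynamo-runtime in the same call is an assumption based on the three separate installs in the hunk.

```dockerfile
# Sketch: one resolver pass for all local wheels found in wheelhouse/.
RUN uv pip install ai-dynamo ai-dynamo-runtime nixl --find-links wheelhouse
```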

Contributor Author

Will update in upcoming PRs.

RUN uv pip install ai-dynamo --find-links wheelhouse && \
uv pip install ai-dynamo-runtime --find-links wheelhouse && \
uv pip install nixl --find-links wheelhouse && \
ln -sf $VIRTUAL_ENV/bin/* /usr/local/bin/

Contributor

I think updating the PATH variable is generally better than symlinking?
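
A minimal sketch of the PATH approach, assuming VIRTUAL_ENV is already defined with ENV or ARG earlier in the stage:

```dockerfile
# Sketch: put the venv first on PATH instead of symlinking its binaries
# into /usr/local/bin.
ENV PATH="${VIRTUAL_ENV}/bin:${PATH}"
```

This resolves the venv's entry points without creating files in /usr/local/bin.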

# Copy examples
COPY ./examples examples/
# Copy examples and set up Python path
COPY . /workspace

Contributor

Can we copy specific folders? I think copying examples, tests, and benchmarks should do.

# Setup the python environment
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/
RUN apt-get update && \
DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends python3-dev && \
DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends build-essential python3-dev libnuma-dev && \

Contributor

is libnuma a requirement for sglang?

Contributor Author

Yes, it is a required dependency for NIXL integration with the sglang backend

Contributor

@nv-anants nv-anants Jul 14, 2025

Please add a comment in the Dockerfile.
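
A sketch of what the requested comment could look like on the install line from this hunk. The wording of the comments is an assumption drawn from the author's reply and the PR description.

```dockerfile
# libnuma-dev: required for the NIXL integration with the sglang backend.
# build-essential, python3-dev: listed in the PR description as runtime
# dependencies for sglang and NIXL.
RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        build-essential python3-dev libnuma-dev && \
    rm -rf /var/lib/apt/lists/*
```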

rm -rf /var/lib/apt/lists/* && \
uv venv $VIRTUAL_ENV --python 3.12 && \
echo "source $VIRTUAL_ENV/bin/activate" >> ~/.bashrc

# Install SGLang and related packages (sgl-kernel, einops, sentencepiece) since they are not included in the runtime wheel
# https://github.com/sgl-project/sglang/blob/v0.4.9.post1/python/pyproject.toml#L18-51
RUN uv pip install "sglang[runtime_common]>=0.4.9.post1" && \

Contributor

is this version different than what is installed in dev container?

Contributor Author

@nv-tusharma nv-tusharma Jul 14, 2025

The version installed in the devel container is built from source commit: sgl-project/sglang#7330
The latest sglang release (0.4.9.post1) contains this fix. We can't simply copy the build from devel to runtime because building a pip distribution wheel fails, and there is no direct way to create a wheel distribution of sglang from the source build: https://docs.sglang.ai/start/install.html#method-2-from-source

Contributor

can we use same commit/tag in both places? ideally with an ARG.
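
One possible shape for this, as a sketch. The argument name SGLANG_VERSION, the clone target /opt/sglang, and the use of the v0.4.9.post1 tag for the devel source build are assumptions; the devel container currently builds from a specific source commit.

```dockerfile
# Sketch: a single build argument shared by both stages.
# Declared before the first FROM, so each stage must re-declare it to use it.
ARG SGLANG_VERSION="0.4.9.post1"

# --- in the devel stage (source build; assumes git is installed) ---
ARG SGLANG_VERSION
RUN git clone --depth 1 --branch "v${SGLANG_VERSION}" \
        https://github.com/sgl-project/sglang.git /opt/sglang

# --- in the runtime stage (release install) ---
ARG SGLANG_VERSION
RUN uv pip install "sglang[runtime_common]==${SGLANG_VERSION}"
```

The version can then be overridden for both stages at once with `--build-arg SGLANG_VERSION=...`.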
