Conversation

whybeyoung
Collaborator

@whybeyoung whybeyoung commented Feb 17, 2025

Motivation

  1. The community lacks a good example of distributed inference in a K8s environment.
  2. The community lacks examples of containerized environments combined with high-speed networks like RoCE.
  3. K8s is the most popular open-source infrastructure platform, but there are no established best practices for integrating it with SGLang, one of the most popular recent open-source inference projects.

@zhaochenyang20 zhaochenyang20 enabled auto-merge (squash) February 18, 2025 02:06
@zhaochenyang20 zhaochenyang20 merged commit c51dc2c into sgl-project:main Feb 18, 2025
11 of 12 checks passed
@@ -0,0 +1,339 @@
# Deploying a RoCE Network-Based SGLANG Two-Node Inference Service on a Kubernetes (K8S) Cluster

Hi @whybeyoung, would you like to submit a PR to lws adding this as an RDMA example? I think it would be very valuable to our users as an RDMA-based quick start. If not, I can submit one. Thanks.

Collaborator

Thanks! You can do it on your side.

4 participants