slin1237
Collaborator

Motivation

This PR implements Kubernetes service discovery for SGL Router, allowing it to automatically discover and manage worker pods based on label selectors. The router dynamically adds a worker URL when a healthy pod is discovered and removes it when the pod becomes unhealthy or is deleted.
Key Features

  • Automatic Worker Discovery: Auto-detects worker pods with matching labels
  • Dynamic Worker Management: Adds/removes pods based on health status
  • Namespace Support: Can watch pods in a specific namespace or cluster-wide (in case the router becomes a cluster-scoped ingress gateway)
  • Configurable Worker URL Generation: Uses pod IP and configurable port
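The selector matching and URL generation described above can be sketched in a few lines. This is an illustrative Python sketch, not the code in service_discovery.rs: the helper names (`parse_selector`, `matches_selector`, `worker_url`) are hypothetical, and the `http://` scheme is an assumption, since the PR description only says the URL is built from the pod IP and a configurable port.

```python
from typing import Dict


def parse_selector(selector: str) -> Dict[str, str]:
    """Parse a comma-separated 'key=value' selector string into a dict."""
    result: Dict[str, str] = {}
    for part in selector.split(","):
        part = part.strip()
        if not part:
            continue
        key, sep, value = part.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"invalid selector term: {part!r}")
        result[key.strip()] = value.strip()
    return result


def matches_selector(pod_labels: Dict[str, str], selector: Dict[str, str]) -> bool:
    """A pod matches when every selector key is present with an equal value."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


def worker_url(pod_ip: str, port: int) -> str:
    """Build a worker URL from the pod IP and port (plain HTTP is assumed here)."""
    return f"http://{pod_ip}:{port}"
```

With the selector from the Usage section, `parse_selector("app=sglang-worker")` yields a single-key mapping, and a pod carrying that label plus any extra labels would still match, because only the selector's keys are checked.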

Modifications

  • Created a new service_discovery.rs module for Kubernetes integration
  • Added configuration options to the Router struct and constructor
  • Added CLI parameters for service discovery configuration
  • Used the Kubernetes API to watch pod events in real time
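To make the watch-driven add/remove behavior concrete, here is a small Python simulation of how a stream of pod events could be folded into a set of worker URLs. It is only a sketch under stated assumptions: the real module is written in Rust, the `PodEvent` type and function names are hypothetical, and "healthy" is assumed to mean a pod with an IP, in the `Running` phase, with its Ready condition true. The event kinds mirror the `ADDED`, `MODIFIED`, and `DELETED` types of the Kubernetes watch API.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class PodEvent:
    """Hypothetical, simplified view of a pod watch event."""
    kind: str     # "ADDED", "MODIFIED", or "DELETED"
    pod_ip: str   # empty string if no IP has been assigned yet
    phase: str    # e.g. "Pending" or "Running"
    ready: bool   # value of the pod's Ready condition


def is_healthy(event: PodEvent) -> bool:
    """Assumed health rule: the pod has an IP, is Running, and is Ready."""
    return bool(event.pod_ip) and event.phase == "Running" and event.ready


def apply_event(workers: Set[str], event: PodEvent, port: int) -> None:
    """Add the pod's URL if it is healthy; otherwise drop it from the set."""
    if not event.pod_ip:
        return  # no IP means there is no URL to add or remove
    url = f"http://{event.pod_ip}:{port}"  # plain HTTP is an assumption
    if event.kind != "DELETED" and is_healthy(event):
        workers.add(url)
    else:
        workers.discard(url)


def reconcile(events: Iterable[PodEvent], port: int) -> Set[str]:
    """Fold a sequence of events into the resulting set of worker URLs."""
    workers: Set[str] = set()
    for event in events:
        apply_event(workers, event, port)
    return workers
```

Keeping the worker set keyed by URL makes repeated `MODIFIED` events idempotent: re-adding an existing URL or discarding a missing one is a no-op.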

Usage

python -m sglang_router.launch_router \
    --service-discovery \
    --selector app=sglang-worker \
    --service-discovery-port 8000 \
    --service-discovery-namespace default
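For readers who want to see how these flags might map onto parsed options, the following is a hypothetical `argparse` sketch. It is not the actual `sglang_router.launch_router` parser: only the four flag names come from the command above, while the `nargs`, the defaults, and the help text are placeholder assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the usage example."""
    parser = argparse.ArgumentParser(prog="launch_router_sketch")
    parser.add_argument(
        "--service-discovery",
        action="store_true",
        help="enable Kubernetes service discovery",
    )
    parser.add_argument(
        "--selector",
        nargs="+",
        default=[],
        metavar="KEY=VALUE",
        help="label selector terms that worker pods must match",
    )
    parser.add_argument(
        "--service-discovery-port",
        type=int,
        default=8000,  # placeholder default, copied from the usage example
        help="port combined with each pod IP to form the worker URL",
    )
    parser.add_argument(
        "--service-discovery-namespace",
        default=None,  # assumption: None stands for watching cluster-wide
        help="namespace to watch; omit to watch all namespaces",
    )
    return parser
```

Passing the arguments from the usage example to `build_parser().parse_args([...])` gives a namespace with `service_discovery=True`, `selector=["app=sglang-worker"]`, `service_discovery_port=8000`, and `service_discovery_namespace="default"`.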

@slin1237 slin1237 requested review from merrymercy and zhyncs April 29, 2025 02:34
@slin1237 slin1237 force-pushed the slin/service-discovery branch from b8cc29f to 180ed24 Compare April 29, 2025 02:44
@slin1237 slin1237 force-pushed the slin/service-discovery branch from 180ed24 to 47760c7 Compare April 29, 2025 02:46
Collaborator

@ByronHsu ByronHsu left a comment

Thanks! This is very useful for k8s deployment

@slin1237 slin1237 merged commit 1468769 into main Apr 29, 2025
4 checks passed
@slin1237 slin1237 deleted the slin/service-discovery branch April 29, 2025 17:21
3 participants