slin1237
Collaborator

@slin1237 slin1237 commented Aug 2, 2025

Motivation

The current router system passes the HTTP client as a parameter through every RouterTrait method, leading to:

  • Verbose method signatures that are difficult to maintain
  • PDRouter creating a redundant HTTP client that goes unused
  • Poor scalability when adding new dependencies (would require changing all method signatures)
  • Unnecessary parameter passing overhead throughout the codebase

This PR refactors the router system to use proper dependency injection through an AppContext pattern, making the code cleaner, more maintainable, and easier to extend.

Modifications

1. Introduced AppContext Pattern

  • Created AppContext struct to serve as a dependency container holding the HTTP client, router configuration, and concurrency limiter
  • Modified AppState to contain both the router and the shared context
  • Provides a single place to manage all cross-cutting concerns
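
The container described above can be sketched roughly as follows. This is a minimal, std-only illustration, not the PR's actual code: `HttpClient` is a stand-in for `reqwest::Client`, `ConcurrencyLimiter` stands in for whatever limiter the real context holds, and the field names and the `max_concurrent_requests` setting are assumptions.

```rust
use std::sync::Arc;

// Stand-in for reqwest::Client so the sketch compiles with std only.
#[derive(Clone, Debug, Default)]
pub struct HttpClient;

// Hypothetical, trimmed-down configuration; the real RouterConfig has other fields.
#[derive(Clone, Debug)]
pub struct RouterConfig {
    pub max_concurrent_requests: usize,
}

// Stand-in for the concurrency limiter (an assumption about its shape).
#[derive(Debug)]
pub struct ConcurrencyLimiter {
    pub permits: usize,
}

// Dependency container: one place that owns the cross-cutting dependencies.
pub struct AppContext {
    pub client: HttpClient,
    pub router_config: RouterConfig,
    pub limiter: Arc<ConcurrencyLimiter>,
}

impl AppContext {
    pub fn new(client: HttpClient, router_config: RouterConfig) -> Self {
        // Size the limiter from the configuration before moving the config in.
        let limiter = Arc::new(ConcurrencyLimiter {
            permits: router_config.max_concurrent_requests,
        });
        Self { client, router_config, limiter }
    }
}

// Minimal placeholder for the router trait.
pub trait RouterTrait: Send + Sync {
    fn router_type(&self) -> &'static str;
}

// Application state now carries both the router and the shared context.
#[derive(Clone)]
pub struct AppState {
    pub router: Arc<dyn RouterTrait>,
    pub context: Arc<AppContext>,
}
```

Adding a new shared dependency in this shape means adding one field to `AppContext` rather than touching every trait method.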

2. Updated RouterFactory

  • Changed create_router signature to accept &Arc<AppContext> instead of &RouterConfig
  • Factory now extracts dependencies from context and passes them to router constructors
  • Maintains clean separation between dependency management and router creation

3. Improved Router Constructors

  • Router: Added client: reqwest::Client parameter and field to store the injected client
  • PDRouter: Added client: reqwest::Client parameter, renamed internal http_client to client for consistency
  • Removed redundant client creation in PDRouter (saving ~4 lines of code and a potential failure point)
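
Sections 2 and 3 can be illustrated together: the factory takes the shared context and hands the client to each constructor. The sketch below is an assumption-laden outline, not the PR's code: `HttpClient` stands in for `reqwest::Client`, and `RoutingMode`, the `String` error type, and the `router_type` method are hypothetical.

```rust
use std::sync::Arc;

// Stand-in for reqwest::Client.
#[derive(Clone, Debug, Default)]
pub struct HttpClient;

// Hypothetical routing mode used only to pick a router in this sketch.
#[derive(Clone, Debug)]
pub enum RoutingMode {
    Regular,
    PrefillDecode,
}

pub struct RouterConfig {
    pub mode: RoutingMode,
}

pub struct AppContext {
    pub client: HttpClient,
    pub router_config: RouterConfig,
}

pub trait RouterTrait: Send + Sync {
    fn router_type(&self) -> &'static str;
}

// Both routers store the injected client instead of building their own.
pub struct Router {
    client: HttpClient,
}

pub struct PDRouter {
    client: HttpClient,
}

impl Router {
    pub fn new(client: HttpClient) -> Self {
        Self { client }
    }
}

impl PDRouter {
    pub fn new(client: HttpClient) -> Self {
        Self { client }
    }
}

impl RouterTrait for Router {
    fn router_type(&self) -> &'static str {
        "regular"
    }
}

impl RouterTrait for PDRouter {
    fn router_type(&self) -> &'static str {
        "pd"
    }
}

pub struct RouterFactory;

impl RouterFactory {
    // Accepts the context rather than a bare config, then extracts what each router needs.
    pub fn create_router(ctx: &Arc<AppContext>) -> Result<Box<dyn RouterTrait>, String> {
        let client = ctx.client.clone();
        match &ctx.router_config.mode {
            RoutingMode::Regular => Ok(Box::new(Router::new(client))),
            RoutingMode::PrefillDecode => Ok(Box::new(PDRouter::new(client))),
        }
    }
}
```

With the real `reqwest::Client`, cloning is cheap because the client is reference-counted internally, so every router shares one connection pool.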

4. Simplified RouterTrait Interface

  • Removed client: &Client parameter from all 10 RouterTrait methods
  • Methods now have cleaner signatures focused on their actual responsibilities
  • Improved API usability for router implementations
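
The signature change can be pictured with a before/after pair. This is a simplified synchronous sketch under stated assumptions: the real trait methods are async and return HTTP responses, `HttpClient` is a stand-in for `reqwest::Client`, and `health`, `get`, and `worker_url` are illustrative names.

```rust
// Stand-in for reqwest::Client.
#[derive(Clone, Debug, Default)]
pub struct HttpClient;

impl HttpClient {
    // Hypothetical helper: describes the request it would send instead of sending it.
    pub fn get(&self, url: &str) -> String {
        format!("GET {}", url)
    }
}

// Before: every trait method carried the client as a parameter.
pub trait RouterTraitBefore {
    fn health(&self, client: &HttpClient) -> String;
}

// After: the client is injected once at construction time.
pub trait RouterTrait {
    fn health(&self) -> String;
}

pub struct Router {
    client: HttpClient,
    worker_url: String,
}

impl Router {
    pub fn new(worker_url: impl Into<String>, client: HttpClient) -> Self {
        Self { client, worker_url: worker_url.into() }
    }
}

impl RouterTrait for Router {
    fn health(&self) -> String {
        // Uses the stored client; callers no longer supply one.
        self.client.get(&format!("{}/health", self.worker_url))
    }
}
```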

5. Updated Implementations

  • Modified all router methods to use self.client instead of receiving it as a parameter
  • Updated background load monitoring tasks to use the injected client
  • Ensured consistent client usage across all router types
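
For the background load monitoring tasks, the idea is that the task captures a clone of the injected client instead of receiving or building one. The sketch below uses `std::thread` and a channel as a stand-in for the async task the real router would spawn; `HttpClient`, `get_load`, and `spawn_load_monitor` are hypothetical names, and the fake load value exists only so the example runs.

```rust
use std::sync::mpsc;
use std::thread;

// Stand-in for reqwest::Client.
#[derive(Clone, Debug, Default)]
pub struct HttpClient;

impl HttpClient {
    // Hypothetical: pretends to query a worker's load; returns the URL length as a fake value.
    pub fn get_load(&self, worker_url: &str) -> usize {
        worker_url.len()
    }
}

pub struct Router {
    client: HttpClient,
    worker_urls: Vec<String>,
}

impl Router {
    pub fn new(worker_urls: Vec<String>, client: HttpClient) -> Self {
        Self { client, worker_urls }
    }

    // Runs one monitoring pass on a background thread using a clone of the injected client.
    pub fn spawn_load_monitor(&self) -> mpsc::Receiver<(String, usize)> {
        let client = self.client.clone();
        let urls = self.worker_urls.clone();
        let (tx, rx) = mpsc::channel();
        thread::spawn(move || {
            for url in urls {
                let load = client.get_load(&url);
                // Stop early if the receiver has gone away.
                if tx.send((url, load)).is_err() {
                    break;
                }
            }
        });
        rx
    }
}
```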

6. Cleaned Up Server Route Handlers

  • Updated handlers to access client through state.context.client
  • Removed client parameter passing from all router method calls
  • Resulted in cleaner, more readable route handler code
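
A framework-free sketch of the handler side follows. The real handlers are web framework handlers that extract shared state; here they are plain functions taking `&AppState`, and `HttpClient`, `get_server_info_handler`, and `probe_handler` are assumed names for illustration only.

```rust
use std::sync::Arc;

// Stand-in for reqwest::Client.
#[derive(Clone, Debug, Default)]
pub struct HttpClient;

impl HttpClient {
    // Hypothetical helper: describes the request instead of sending it.
    pub fn get(&self, url: &str) -> String {
        format!("GET {}", url)
    }
}

pub trait RouterTrait: Send + Sync {
    fn get_server_info(&self) -> String;
}

pub struct AppContext {
    pub client: HttpClient,
}

#[derive(Clone)]
pub struct AppState {
    pub router: Arc<dyn RouterTrait>,
    pub context: Arc<AppContext>,
}

pub struct Router {
    client: HttpClient,
    worker_url: String,
}

impl RouterTrait for Router {
    fn get_server_info(&self) -> String {
        self.client.get(&format!("{}/get_server_info", self.worker_url))
    }
}

// Delegating handler: no client argument is threaded through the call.
pub fn get_server_info_handler(state: &AppState) -> String {
    state.router.get_server_info()
}

// A handler that needs the client directly reads it from the shared context.
pub fn probe_handler(state: &AppState, url: &str) -> String {
    state.context.client.get(url)
}
```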

7. Fixed All Tests

  • Updated test helpers to create AppContext for tests
  • Modified all test files to use the new factory signature
  • All tests pass successfully with the new architecture

Accuracy Test

This PR only affects the router infrastructure and dependency injection pattern. It does not modify any model-side code, kernels, or model architecture. No accuracy testing is required.

Benchmark & Profiling

This refactoring is expected to be performance-neutral, as it:

  • Uses the same HTTP client configuration with identical connection pooling settings
  • Maintains the same request routing logic and load balancing algorithms
  • Removes per-call client parameter passing (at most a minor performance improvement)
  • Does not introduce any new allocations or computational overhead

The change is purely architectural and should not impact throughput or latency.

Checklist

  • Format your code according to the Code Formatting with Pre-Commit.
  • Add unit tests as outlined in the Running Unit Tests.
    • All existing tests updated and passing
    • Test coverage maintained at the same level
  • Update documentation / docstrings / example tutorials as needed, according to Writing Documentation.
    • Updated all relevant code comments
    • Method signatures in trait documentation updated
  • Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to Benchmark and Profiling and Accuracy Results.
    • Not applicable - architectural refactoring only
  • For reviewers: If you haven't made any contributions to this PR and are only assisting with merging the main branch, please remove yourself as a co-author when merging the PR.
  • Please feel free to join our Slack channel at https://slack.sglang.ai to discuss your PR.

@slin1237 slin1237 requested a review from ByronHsu as a code owner August 2, 2025 21:11
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @slin1237, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

I've refactored the router system to implement a proper HTTP Dependency Injection pattern. The primary goal was to address verbosity, redundancy, and scalability issues caused by passing the HTTP client through every RouterTrait method. This change introduces an AppContext to centralize dependencies, streamlines router construction, and simplifies method signatures across the router system, resulting in cleaner, more maintainable, and extensible code without impacting performance or accuracy.

Highlights

  • Centralized Dependency Management: I've introduced an AppContext struct that acts as a central container for shared dependencies like the HTTP client, router configuration, and concurrency limiter. This simplifies dependency management across the application.
  • Updated Router Factory: The RouterFactory now accepts AppContext to create router instances, ensuring that all necessary dependencies are properly injected into the router constructors.
  • Improved Router Constructors: Both Router and PDRouter constructors have been updated to directly receive the reqwest::Client as a parameter, storing it internally. This eliminates redundant client creation and ensures consistent client usage.
  • Simplified RouterTrait Interface: I've significantly simplified the RouterTrait interface by removing the client: &Client parameter from all its methods. This makes method signatures cleaner and more focused on their core responsibilities.
  • Consistent Client Usage: All router methods and background tasks now use the internally stored self.client instead of receiving it as a parameter, leading to more consistent and maintainable code.
  • Cleaner Server Route Handlers: Server route handlers have been cleaned up; they no longer explicitly pass the HTTP client to router method calls, accessing it through the state.context.client where needed.
  • Comprehensive Test Updates: All test helpers and test files have been updated to align with the new AppContext and dependency injection patterns, ensuring full test coverage and functionality are maintained.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a dependency injection pattern using an AppContext to manage shared dependencies like the HTTP client. This is a great architectural improvement that cleans up the RouterTrait interface by removing the client parameter from its methods, making the code more maintainable and scalable. The changes are consistently applied across the router implementations, server handlers, and tests. My main feedback is to extend the refactoring to some of the inherent helper methods within the router implementations to fully adopt the new pattern.

@@ -22,29 +22,34 @@ use tokio::spawn;
 use tracing::{error, info, warn, Level};

 #[derive(Clone)]
-pub struct AppState {
-    pub router: Arc<dyn RouterTrait>,
+pub struct AppContext {
Collaborator

What is the reason for this change?

@slin1237 slin1237 merged commit 828a4fe into main Aug 3, 2025
19 of 21 checks passed
@slin1237 slin1237 deleted the slin/http-client branch August 3, 2025 02:16