[Feature] Partial Deserialization in rust router implementation. #7298

@hnyls2002

Motivation

In the current SGLang router implementation (written in Rust), we support:

  • Regular routing strategies: cache-aware, random, and round-robin
  • Prefill-decode (PD) disaggregated routing: random and power-of-two (Po2) based

Previously, incoming requests were deserialized from raw bytes into dictionaries (maps) to extract minimal fields (e.g., stream). However, with the addition of PD routing requirements, fields like bootstrap_port and bootstrap_room need to be injected into the request object. As a result, the router now deserializes the full request into a fully typed struct.

This shift raises performance concerns regarding deserialization overhead, especially under high QPS.

Goal

Evaluate and implement an optimized solution that balances:

  • Performance overhead
  • Code maintainability
  • Flexibility for routing logic extensions

Task

  • Benchmark and compare the following approaches:
    • Full deserialization of typed request objects
    • Partial deserialization (extract only required fields)
    • Byte-based routing (minimal/no deserialization)
  • Profile latency and CPU cost in each scenario (especially under load)
  • Propose and implement a best-practice design based on the findings, e.g., use partial deserialization for the fast path (stream detection, method detection) and fall back to full deserialization only when needed (e.g., bootstrap injection)

Related resources

Sample bootstrap injection:

    fn inject_bootstrap_fields(
        &self,
        json: &mut serde_json::Value,
        prefill: &EngineInfo,
        batch_size: Option<usize>,
    ) -> Result<(), String> {
        let obj = json
            .as_object_mut()
            .ok_or("Request body is not a JSON object")?;

        // Generate bootstrap room
        let room_id = rand::random::<u64>();

        match batch_size {
            Some(n) => {
                // Batch format
                obj.insert(
                    "bootstrap_host".to_string(),
                    serde_json::json!(vec![prefill.url.as_str(); n]),
                );
                obj.insert(
                    "bootstrap_port".to_string(),
                    serde_json::json!(vec![prefill.bootstrap_port; n]),
                );
                obj.insert(
                    "bootstrap_room".to_string(),
                    serde_json::json!(vec![room_id; n]),
                );
            }
            None => {
                // Single format
                obj.insert(
                    "bootstrap_host".to_string(),
                    serde_json::json!(prefill.url.as_str()),
                );
                obj.insert(
                    "bootstrap_port".to_string(),
                    serde_json::json!(prefill.bootstrap_port),
                );
                obj.insert("bootstrap_room".to_string(), serde_json::json!(room_id));
            }
        }

        Ok(())
    }
