Performance degraded by 8% after upgrading sglang from 0.4.3 to 0.4.5 #5223

@ch-tiger1

I deployed DeepSeek-R1 on a single H20-141G node and used the vLLM benchmark script benchmark_serving.py to stress-test it at several concurrency levels. With the same parameter configuration, input throughput dropped by 8% and output throughput dropped by 12% after the version upgrade.

  • sglang 0.4.3
python3 -m sglang.launch_server \
        --model /home/model/DeepSeek-R1/ \
        --tp 8 \
        --trust-remote-code \
        --enable-dp-attention \
        --port "9001" \
        --host 0.0.0.0 \
        --enable-metrics
  • sglang 0.4.5
python3 -m sglang.launch_server \
        --model /home/model/DeepSeek-R1/ \
        --tp 8 \
        --dp 8 \
        --trust-remote-code \
        --enable-dp-attention \
        --port "9001" \
        --host 0.0.0.0 \
        --enable-metrics
  • performance test
python3 benchmark_serving.py \
        --backend sglang \
        --model $model \
        --tokenizer 'deepseek-tokenizer' \
        --dataset-name "random" \
        --host $ip \
        --port $port \
        --random-input-len 1024 \
        --random-output-len 1024 \
        --ignore-eos \
        --max-concurrency $concurrency \
        --num-prompts $prompts \
        --seed 12345 \
        --trust-remote-code
  • result

[Image: benchmark results]
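For anyone trying to reproduce this, the benchmark command above can be wrapped in a loop over concurrency levels. The following is only a sketch: the function name run_sweep, the DRY_RUN switch, and the rule of 4 prompts per concurrency slot are illustrative assumptions and not part of the original report. The benchmark flags themselves are copied from the command above.

```shell
# Sketch only: run_sweep, DRY_RUN, and the prompts-per-level rule are
# illustrative assumptions, not taken from the original report.
run_sweep() {
    local model="$1"
    local ip="$2"
    local port="$3"
    shift 3
    local concurrency prompts
    for concurrency in "$@"; do
        # Assumption: send 4 prompts per concurrent slot.
        prompts=$((concurrency * 4))
        # When DRY_RUN is set, print the command instead of running it.
        ${DRY_RUN:+echo} python3 benchmark_serving.py \
            --backend sglang \
            --model "$model" \
            --tokenizer 'deepseek-tokenizer' \
            --dataset-name "random" \
            --host "$ip" \
            --port "$port" \
            --random-input-len 1024 \
            --random-output-len 1024 \
            --ignore-eos \
            --max-concurrency "$concurrency" \
            --num-prompts "$prompts" \
            --seed 12345 \
            --trust-remote-code
    done
}
```

Calling it as DRY_RUN=1 run_sweep "$model" "$ip" "$port" 1 8 32 prints one command per concurrency level without contacting the server. Running the same sweep against both sglang versions keeps the load pattern identical between them.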

I don't understand what is causing the performance degradation. Can you explain it?
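For reference, the degradation percentages quoted above are relative drops against the 0.4.3 baseline. A minimal sketch of that arithmetic, assuming a hypothetical helper named pct_drop that takes the baseline throughput and the new throughput:

```shell
# Hypothetical helper (not part of sglang or vllm): prints the relative
# drop from a baseline throughput to a new throughput, as a percentage.
pct_drop() {
    awk -v old="$1" -v new="$2" 'BEGIN {
        if (old <= 0) {
            print "baseline must be > 0" > "/dev/stderr"
            exit 1
        }
        # (old - new) / old, expressed as a percentage with one decimal.
        printf "%.1f\n", (old - new) / old * 100
    }'
}
```

With made-up numbers, pct_drop 1000 920 prints 8.0. These values are illustrative only and are not the measurements from this issue.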
