Supported Stats for Speculative Decoding for Chat API #1

FrankLeeeee · 2025-06-09T16:31:51Z

Motivation

When we run benchmark/mtbench/bench_sglang_eagle.py, this will use the /generate API by default, however, it does not work well for models which require chat APIs such as Llama4, as a result, the acceptance length is extremely low for these models.

Thus, I updated this part of code for two purposes:

enable chat api in SGLang frontend
enable speculative decoding stats for chat api

Modifications

The results seem good.

Checklist

Format your code according to the Code Formatting with Pre-Commit.
Add unit tests as outlined in the Running Unit Tests.
Update documentation / docstrings / example tutorials as needed, according to Writing Documentation.
Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to Benchmark and Profiling and Accuracy Results.
For reviewers: If you haven't made any contributions to this PR and are only assisting with merging the main branch, please remove yourself as a co-author when merging the PR.
Please feel free to join our Slack channel at https://slack.sglang.ai to discuss your PR.

This reverts commit 22a52b3.

FrankLeeeee added 2 commits June 9, 2025 04:57

support chat endpoint in frontend

7ada509

supported spec decoding stats for chat api

070b403

FrankLeeeee merged commit 22a52b3 into nv_eagle3 Jun 9, 2025
1 check failed

FrankLeeeee added a commit that referenced this pull request Jun 29, 2025

Revert "Supported Stats for Speculative Decoding for Chat API (#1)"

3ab17ff

This reverts commit 22a52b3.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Supported Stats for Speculative Decoding for Chat API #1

Supported Stats for Speculative Decoding for Chat API #1

Uh oh!

FrankLeeeee commented Jun 9, 2025 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Supported Stats for Speculative Decoding for Chat API #1

Supported Stats for Speculative Decoding for Chat API #1

Uh oh!

Conversation

FrankLeeeee commented Jun 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

Modifications

Checklist

Uh oh!

Uh oh!

Uh oh!

FrankLeeeee commented Jun 9, 2025 •

edited

Loading