Conversation

@HermitSun (Contributor) commented Apr 1, 2025

Motivation

Resolve #4822.

Modifications

Support loading safetensors weights with runai_streamer. This can be enabled by passing the option --load-format runai_streamer when launching the server.
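As an illustration, a launch invocation with the new flag might look like the following. This is a sketch, not taken from the PR: the model path is a placeholder example, and any S3/storage-specific environment configuration required by the RunAI streamer is omitted.

```shell
# Hypothetical launch command; the model path is an example, not from this PR.
python -m sglang.launch_server \
  --model-path meta-llama/Llama-3.1-8B-Instruct \
  --load-format runai_streamer
```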

Checklist

@@ -424,6 +424,7 @@ def add_cli_args(parser: argparse.ArgumentParser):
             "bitsandbytes",
             "layered",
             "remote",
+            "runai_streamer",
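The diff above adds "runai_streamer" to the allowed values of the --load-format CLI option. A minimal, self-contained sketch of how such an argparse choice list works is below; the surrounding choices shown here and the "auto" default are assumptions for illustration, not SGLang's actual server_args.py.

```python
import argparse

def add_cli_args(parser: argparse.ArgumentParser):
    # Sketch of the --load-format option; the full choice list and the
    # default value in SGLang's server_args.py are assumed, not copied.
    parser.add_argument(
        "--load-format",
        type=str,
        default="auto",
        choices=[
            "auto",
            "safetensors",
            "bitsandbytes",
            "layered",
            "remote",
            "runai_streamer",  # the choice added by this PR
        ],
        help="The format of the model weights to load.",
    )

parser = argparse.ArgumentParser()
add_cli_args(parser)
args = parser.parse_args(["--load-format", "runai_streamer"])
print(args.load_format)
```

With the choice registered, argparse rejects unknown formats at parse time, so an unsupported --load-format value fails fast with a usage error instead of surfacing later during weight loading.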
A reviewer (Contributor) commented:

You may want to add a description for it below.

@HermitSun (Contributor, Author) replied:
Thanks for the reminder, I've added it.
As for why I didn't add a comment for remote: I think the logic of runai_streamer and remote can be merged, so I'll try to refactor this logic a bit later.

@brayden-hai

Hi @HermitSun, I'm wondering whether the existing SGLang already supports the RunAI streamer, as I was able to install it but the performance was still not as good as expected. I'm interested in the S3 use case. Right now I'm using the basic MP model loader; have you compared this performance with the MP loader in #7277?

Successfully merging this pull request may close these issues.

[Feature] Load model weight in parallel