server base: ChatScheduler requests sglang openai-compatible server base #1721

@chenhaiq

Description

This issue is a follow-up task from #1698.

Performance is poor because every request needs to call broadcast_pyobj together with async_generate.

We need to find a way to call async_generate without broadcast_pyobj.
