
SGLang Integration #8

@merrymercy

Nice project!

I believe this project could benefit greatly from SGLang (https://github.com/sgl-project/sglang). You can try using it as a backend for local models.

  • The fast JSON decoding feature can help you enforce additional constraints and will probably help with nested JSON schemas. You can find the example here.
  • The RadixAttention feature can help you reuse the KV cache for a shared prefix. You can find one example of parallel decoding here.
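
To make the prefix-reuse point a bit more concrete, here is a toy sketch of the idea. It is not SGLang's RadixAttention code and does not call SGLang at all: it only models a prefix cache as a trie over token ids and counts how many tokens each request would still need to process. The class name, function name, and token ids are all made up for illustration.

```python
# Toy illustration only: this is NOT SGLang's RadixAttention implementation.
# It models a prefix cache as a trie over token ids and counts how many
# tokens of each new request are not yet covered by earlier requests.


class PrefixCache:
    """Hypothetical helper: a trie built from nested dicts."""

    def __init__(self):
        self.root = {}  # token id -> child node

    def match_len(self, tokens):
        """Return how many leading tokens are already cached."""
        node, matched = self.root, 0
        for tok in tokens:
            if tok not in node:
                break
            node = node[tok]
            matched += 1
        return matched

    def insert(self, tokens):
        """Record a token sequence so later requests can reuse its prefix."""
        node = self.root
        for tok in tokens:
            node = node.setdefault(tok, {})


def tokens_to_compute(cache, requests):
    """For each request, count the uncached tokens, then cache the request."""
    costs = []
    for tokens in requests:
        reused = cache.match_len(tokens)
        costs.append(len(tokens) - reused)
        cache.insert(tokens)
    return costs


# Made-up token ids: a shared 4-token prompt followed by different suffixes.
shared = [101, 7, 7, 42]
requests = [shared + [1, 2], shared + [3], shared + [1, 9, 9]]
costs = tokens_to_compute(PrefixCache(), requests)
print(costs)  # [6, 1, 2]
```

Only the first request pays for the shared prompt; the later ones pay only for their own suffixes. A real KV cache stores attention state rather than token ids, so treat this purely as a picture of why a shared prefix is cheap to reuse.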
