henryhmko
Contributor

Motivation

Expose the max total number of tokens, as requested in #1900.

Modifications

Usage: call get_max_total_num_tokens through the Runtime or Engine API.

I'll add examples to Native APIs doc if this looks good.
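
As a rough illustration of how a caller might use the exposed value, here is a minimal sketch. It does not import the real library: StubEngine is a hypothetical stand-in, and the assumptions that get_max_total_num_tokens takes no arguments and returns an int are mine, not confirmed by this PR. The helper fits_in_token_budget is likewise an illustrative name.

```python
from typing import Protocol


class TokenCapacitySource(Protocol):
    """Anything exposing get_max_total_num_tokens (assumed: no args, returns int)."""

    def get_max_total_num_tokens(self) -> int: ...


class StubEngine:
    """Hypothetical stand-in for a Runtime or Engine object, for illustration only."""

    def __init__(self, max_total_num_tokens: int) -> None:
        self._max_total_num_tokens = max_total_num_tokens

    def get_max_total_num_tokens(self) -> int:
        return self._max_total_num_tokens


def fits_in_token_budget(
    backend: TokenCapacitySource, prompt_tokens: int, max_new_tokens: int
) -> bool:
    """Return True if prompt plus generation stays within the reported maximum."""
    if prompt_tokens < 0 or max_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_new_tokens <= backend.get_max_total_num_tokens()


if __name__ == "__main__":
    engine = StubEngine(max_total_num_tokens=4096)
    print(fits_in_token_budget(engine, prompt_tokens=1000, max_new_tokens=512))
```

With a real Runtime or Engine instance in place of the stub, the same check could be used client-side to reject oversized requests before sending them, provided the assumed signature holds.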

Checklist

  • Format your code according to the Contributor Guide.
  • Add unit tests as outlined in the Contributor Guide.
  • Update documentation as needed, including docstrings or example tutorials.

@henryhmko
Contributor Author

Updated documentation in Native APIs.

Contributor

@merrymercy merrymercy left a comment

@merrymercy merrymercy enabled auto-merge (squash) November 22, 2024 09:57
@merrymercy merrymercy disabled auto-merge November 22, 2024 23:10
@merrymercy merrymercy merged commit c35cd1f into sgl-project:main Nov 22, 2024
10 of 13 checks passed
timethink pushed a commit to timethink/sglang that referenced this pull request Mar 9, 2025
2 participants