mobicham
Contributor

Motivation

Currently, the GemLite cache is not saved, which makes warm-up/capture very slow when no cache is provided.

Modifications

  • Added a util function in layers/torchao_utils to handle cache saving.
  • Added the cache saving call after each capture.

Saving after each capture ensures that the cache is updated frequently. The save call could also be moved somewhere else, which is why this is opened as a draft PR.
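For illustration, here is a minimal sketch of the "save after each capture" pattern. It is not the code from this PR: `save_cache`, `capture_all`, the dict-based cache, and the JSON file format are all hypothetical stand-ins, and the real helper in layers/torchao_utils would rely on GemLite's own cache-saving mechanism instead.

```python
import json
import os
import tempfile
from typing import Callable, Dict, Iterable


def save_cache(cache: Dict[str, dict], path: str, print_error: bool = False) -> bool:
    """Hypothetical helper: write the in-memory cache to `path`.

    Returns True on success and False on failure, so a failed save never
    interrupts the capture loop.
    """
    tmp_path = None
    try:
        directory = os.path.dirname(path) or "."
        os.makedirs(directory, exist_ok=True)
        # Write to a temporary file and rename it into place, so an
        # interrupted save does not leave a truncated cache file behind.
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w") as f:
            json.dump(cache, f)
        os.replace(tmp_path, path)
        return True
    except Exception as exc:
        # Clean up the partial temporary file, if one was created.
        if tmp_path is not None and os.path.exists(tmp_path):
            os.remove(tmp_path)
        if print_error:
            print(f"Failed to save the cache: {exc}")
        return False


def capture_all(
    batch_sizes: Iterable[int],
    capture_one: Callable[[int, Dict[str, dict]], None],
    cache: Dict[str, dict],
    cache_path: str,
) -> None:
    """Hypothetical capture loop: persist the cache after every capture."""
    for bs in batch_sizes:
        capture_one(bs, cache)  # may add new entries to the cache
        save_cache(cache, cache_path)  # keep the on-disk copy current
```

The trade-off is one extra file write per capture in exchange for keeping earlier results if a later capture fails or the process is stopped midway.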

@mobicham mobicham marked this pull request as ready for review December 30, 2024 12:30
@merrymercy merrymercy merged commit a29dd95 into sgl-project:main Dec 30, 2024
15 checks passed
XiaotongJiang pushed a commit to XiaotongJiang/sglang that referenced this pull request Jan 3, 2025
timethink pushed a commit to timethink/sglang that referenced this pull request Mar 9, 2025