@fzyzcjy fzyzcjy commented Dec 28, 2024

Motivation

Two motivations:

  1. Part 3 of "[Feature] Proposal: Releasing SGLang memory when idle" (#2583).
  2. To allow easy testing of part 2 of the same proposal (#2583); see also "[Feature] (Willing to PR) Avoid KV cache occupying GPU memory when not used" (#2542).

So far I have only tested correctness, not performance (e.g. does it cause a memory copy? will it be slow?). For this PR, please treat it as a testing tool, just as get_weights_by_name is for testing. I will check performance later.

Modifications

Checklist

  • Format your code according to the Contributor Guide.
  • Add unit tests as outlined in the Contributor Guide.
  • Update documentation as needed, including docstrings or example tutorials.

@merrymercy merrymercy merged commit fd28640 into sgl-project:main Dec 28, 2024
15 checks passed
timethink pushed a commit to timethink/sglang that referenced this pull request Mar 9, 2025