Conversation

jessegross
Contributor

Models that use sliding window attention can only resume a sequence from the cache if it falls within the saved windows. This works well if the next message picks up where the old one left off. However, it generally prevents a partial prefix match unless the entire conversation falls within the sliding window.

This can be a problem with reasoning models, whose reasoning traces are supposed to be removed from future messages, forcing the entire history to be re-evaluated.

This change allows models to specify that a larger amount of the history be retained in memory, allowing partial resumption in more cases. It still respects the window that the model was trained on for token generation.
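To illustrate the mechanism, here is a minimal sketch in Go of why retaining more history than the trained attention window enables partial resumption. The names here (`resumePoint`, `trainedWindow`, `retainWindow`) are hypothetical and not taken from this PR; the code only models the idea described above, under the assumption that resuming at a position requires that position's full attention window to still be in the cache.

```go
package main

import "fmt"

// commonPrefix returns the number of leading tokens shared by two sequences.
func commonPrefix(a, b []int) int {
	n := 0
	for n < len(a) && n < len(b) && a[n] == b[n] {
		n++
	}
	return n
}

// resumePoint returns the position from which a new prompt can resume against
// a cached sequence. Resuming at position p requires the cache to still hold
// positions [p-trainedWindow, p); anything older than the last retainWindow
// tokens has been evicted.
func resumePoint(cached, prompt []int, trainedWindow, retainWindow int) int {
	p := commonPrefix(cached, prompt)
	oldestCached := len(cached) - retainWindow // oldest position still in cache
	if oldestCached < 0 {
		oldestCached = 0
	}
	if p-trainedWindow < oldestCached {
		return 0 // attention window not fully cached: recompute from scratch
	}
	return p
}

func main() {
	cached := []int{1, 2, 3, 4, 5, 6, 7, 8}
	prompt := []int{1, 2, 3, 4, 99, 100} // diverges at position 4

	// Retaining only the trained window forces full re-evaluation...
	fmt.Println(resumePoint(cached, prompt, 2, 2)) // 0
	// ...while a larger retention window allows resuming at the divergence.
	fmt.Println(resumePoint(cached, prompt, 2, 8)) // 4
}
```

In the first call, only the trained window is retained, so the attention window at the divergence point has already been evicted and the whole prompt must be recomputed; in the second, the extra retained history lets generation resume at the divergence, while the trained window is still what governs masking during generation.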
@jessegross jessegross merged commit 4183bb0 into main Jul 31, 2025
8 checks passed
@jessegross jessegross deleted the jessegross/swa branch July 31, 2025 21:48