
Conversation

Kosinkadink
Collaborator

@Kosinkadink Kosinkadink commented Aug 22, 2025

Implemented EasyCache natively for (almost) all supported models in ComfyUI, and created LazyCache for a more generic and universally compatible implementation of the core concepts of EasyCache.

What is it?

EasyCache is in the family of step-skipping techniques like TeaCache that reuse cached values from a previous step whenever a step is judged unnecessary to sample. Unlike TeaCache, EasyCache has only one hyperparameter, reuse_threshold, which can be adjusted quickly to suit a model's needs. It is also 'easy' to implement in that it only cares about the inputs and outputs of the models' forward functions, not anything inside them. I implemented it by making sure (almost) every supported ComfyUI model's forward function has DIFFUSION_MODEL wrapper compatibility - no internal model code was modified.
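The idea above can be sketched in a few lines. This is a toy illustration, not the PR's actual code: all class and attribute names here are hypothetical, and a plain numpy array stands in for the latent tensor. The wrapper tracks how much the model's input changes between steps and, while the accumulated change stays under reuse_threshold, returns the cached input-to-output delta instead of running the model.

```python
import numpy as np

class EasyCacheSketch:
    """Toy sketch of the EasyCache idea (illustrative names, not ComfyUI's API):
    wrap a forward call, accumulate the relative input change between steps,
    and reuse a cached output delta while that change stays below threshold."""

    def __init__(self, reuse_threshold=0.2):
        self.reuse_threshold = reuse_threshold
        self.prev_input = None      # input from the last fully computed step
        self.cached_delta = None    # output - input from that step
        self.accumulated = 0.0      # accumulated relative input change

    def __call__(self, forward, x):
        if self.prev_input is not None and self.cached_delta is not None:
            change = np.abs(x - self.prev_input).mean() / np.abs(self.prev_input).mean()
            self.accumulated += float(change)
            if self.accumulated < self.reuse_threshold:
                # skip the model: approximate the output with the cached delta
                return x + self.cached_delta
        out = forward(x)            # full model evaluation
        self.prev_input = x
        self.cached_delta = out - x
        self.accumulated = 0.0
        return out
```

Because the wrapper only touches what goes into and comes out of forward, it needs no knowledge of the model's internals - which is why the PR could cover so many model types by only adding wrapper compatibility.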

Each modified model type was tested with an example or template workflow with and without EasyCache enabled to make sure nothing that worked before broke.

I also invented a more universal but generally worse implementation of EasyCache that I dub LazyCache - instead of caching individual conds around the models' forward function, it just deals with the full combined latents before and after each step. Sometimes when one works poorly for a model, the other does better - always favor EasyCache first, though. To implement it, I added a PREDICT_NOISE wrapper handled by the outer_predict_noise function in CFGGuider. Previously there was no place to cleanly wrap around the noise prediction of each sampling step before/after CFG, so this wrapper 'endpoint' can potentially be used for other things.
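A rough sketch of what a step-level wrapper in the spirit of that PREDICT_NOISE endpoint looks like: it sees the full combined latent before and after each sampling step (post-CFG), so a LazyCache-style cache needs only one cached delta per step rather than one per cond. The function signature and state keys below are illustrative, not ComfyUI's actual wrapper API, and scalars stand in for latent tensors.

```python
# Hypothetical step-level cache wrapper; `executor` stands in for the wrapped
# predict_noise call, `state` for whatever object holds the cache.
def lazycache_step(executor, x, state):
    prev = state.get("prev_x")
    if prev is not None and "delta" in state:
        rel_change = abs(x - prev) / max(abs(prev), 1e-8)
        state["accum"] = state.get("accum", 0.0) + rel_change
        if state["accum"] < state["threshold"]:
            return x + state["delta"]      # skip: reuse the last step's delta
    out = executor(x)                      # run the real noise prediction
    state["prev_x"], state["delta"], state["accum"] = x, out - x, 0.0
    return out
```

Operating after CFG combines the conds is what makes this variant universal: it never needs to know how many conds a model uses or how they are batched.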


Usage:

EasyCache should ideally keep the composition very similar to the non-EasyCache generation, at the cost of some quality.

Here is what each parameter does on the nodes:

  • reuse_threshold: value used to decide how liberally steps will be skipped. The higher the value, the more steps will be skipped (higher speedup, lower quality); the lower the value, the fewer steps will be skipped (lower speedup, higher quality). If you need to finetune the value and want to see what numbers the code sees, toggle verbose to True; your console will then be spammed with log statements for each sampling step.
  • start_percent: determines what relative point in the sampling steps to begin considering skipping steps. The early steps determine overall composition and are important in maintaining image quality. If you find that the resulting image strays too far from the non-Cached version for a particular model, consider increasing start_percent.
  • end_percent: determines what relative point in the sampling steps to end considering skipping steps. The late steps are responsible for some of the finer details of the image. If you find the finer details lacking, consider decreasing end_percent.
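The start_percent/end_percent window can be illustrated with a hypothetical helper like the one below: a step is only eligible for skipping inside the relative range. (Per the commit notes, the real implementation works from sigmas in transformer_options rather than raw step indices; the simple step fraction here is just an approximation of that.)

```python
# Hypothetical gate, not ComfyUI's actual code: only steps whose relative
# progress falls inside [start_percent, end_percent] may be skipped.
def in_cache_window(step, total_steps, start_percent, end_percent):
    progress = step / max(total_steps - 1, 1)  # 0.0 at first step, 1.0 at last
    return start_percent <= progress <= end_percent

# e.g. 20 steps with start_percent=0.2, end_percent=0.95:
eligible = [s for s in range(20) if in_cache_window(s, 20, 0.2, 0.95)]
```

Raising start_percent protects the early, composition-defining steps; lowering end_percent protects the late, detail-refining steps - matching the tuning advice above.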

Results:

Flux Kontext w/ 40 steps:

  • EasyCache enabled - threshold: 0.2, start_percent: 0.15, end_percent: 0.95
    • Skipped 21/40 steps (2.11x speedup)
  • LazyCache enabled - threshold: 0.2, start_percent: 0.15, end_percent: 0.95
    • Skipped 22/40 steps (2.22x speedup)
[image: easycache_00004_]

Qwen Image Edit w/ 20 steps:

  • EasyCache enabled - threshold: 0.1, start_percent: 0.2, end_percent: 0.95
    • skipped 9/20 steps (1.82x speedup)
  • LazyCache enabled - threshold: 0.2, start_percent: 0.2, end_percent: 0.95
    • skipped 9/20 steps (1.82x speedup)
[image: easycache_00005_]

WAN2.2 14B First Frame w/ 20 steps (10 per sampler):

  • EasyCache First Sampler:
    • threshold: 0.10, start_percent: 0.15, end_percent: 1.0
      • skipped 3/10 steps (1.43x speedup)
  • EasyCache Second Sampler:
    • threshold: 0.10, start_percent: 0.0, end_percent: 0.95
      • skipped 5/10 steps (2.00x speedup)
  • LazyCache First Sampler:
    • threshold: 0.12, start_percent: 0.15, end_percent: 1.0
      • skipped 2/10 steps (1.25x speedup)
  • LazyCache Second Sampler:
    • threshold: 0.12, start_percent: 0.0, end_percent: 0.95
      • skipped 4/10 steps (1.67x speedup)
[video: easycache_00008_.mp4]
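The reported speedups line up with total_steps / (total_steps - skipped), i.e. they assume a skipped step costs essentially nothing compared with a full model evaluation:

```python
# Sanity check on the reported numbers, assuming skipped steps are ~free.
def approx_speedup(total_steps, skipped):
    return total_steps / (total_steps - skipped)

print(round(approx_speedup(40, 21), 2))   # 2.11, the Flux Kontext EasyCache run
print(round(approx_speedup(20, 9), 2))    # 1.82, the Qwen Image Edit runs
```

In practice the measured speedup can fall slightly short of this bound, since the cache bookkeeping itself is not entirely free.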

@yamatazen

Is it like TeaCache?

@Kosinkadink
Collaborator Author

@yamatazen Yep, I just updated the PR's info with a lot more details! I published it in the middle of the night so I didn't have the energy to fill it out at that point haha

@handoniumumumum

handoniumumumum commented Aug 22, 2025

Thank you so much, looking forward to testing this with Wan 2.2 5B. Easycache is the best option for it right now but it requires a non-native generation workflow. This will let me implement it AND refactor :)

@comfyanonymous comfyanonymous merged commit fc24715 into master Aug 23, 2025
7 checks passed
@comfyanonymous comfyanonymous deleted the easycache branch August 23, 2025 02:41
@NielsGx

NielsGx commented Aug 23, 2025

Would this work together with TeaCache? But I guess the quality drop would be huge.
Would you say this is better than TeaCache and a replacement for it?

Also, does it also work for image diffusion models? (Looking at your examples)

Thank you!

Vander-Bilt pushed a commit to Vander-Bilt/ComfyUI that referenced this pull request Aug 26, 2025
* Attempting a universal implementation of EasyCache, starting with flux as test; I screwed up the math a bit, but when I set it just right it works.

* Fixed math to make threshold work as expected, refactored code to use EasyCacheHolder instead of a dict wrapped by object

* Use sigmas from transformer_options instead of timesteps to be compatible with a greater amount of models, make end_percent work

* Make log statement when not skipping useful, preparing for per-cond caching

* Added DIFFUSION_MODEL wrapper around forward function for wan model

* Add subsampling for heuristic inputs

* Add subsampling to output_prev (output_prev_subsampled now)

* Properly consider conds in EasyCache logic

* Created SuperEasyCache to test what happens if caching and reuse is moved outside the scope of conds, added PREDICT_NOISE wrapper to facilitate this test

* Change max reuse_threshold to 3.0

* Mark EasyCache/SuperEasyCache as experimental (beta)

* Make Lumina2 compatible with EasyCache

* Add EasyCache support for Qwen Image

* Fix missing comma, curse you Cursor

* Add EasyCache support to AceStep

* Add EasyCache support to Chroma

* Added EasyCache support to Cosmos Predict t2i

* Make EasyCache not crash with Cosmos Predict ImageToVideo latents, but it does not work well at all

* Add EasyCache support to hidream

* Added EasyCache support to hunyuan video

* Added EasyCache support to hunyuan3d

* Added EasyCache support to LTXV (not very good, but does not crash)

* Implemented EasyCache for aura_flow

* Renamed SuperEasyCache to LazyCache, hardcoded subsample_factor to 8 on nodes

* Extra logging when verbose is true for EasyCache