
Conversation

joeyballentine
Member

After doing some profiling, I noticed that clearing the CUDA cache was taking a significant amount of time, and it was happening after every upscale, which I'm pretty sure is incorrect and wasteful. We only need to release the memory when we're actually done with it (i.e. when the chain finishes).

So, I added a hook system to packages. I'm not sure if this is the best API for it, but packages can now define on_chain_finish callbacks that run when the chain is finished, for things like resource cleanup.
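To illustrate the idea, here's a minimal sketch of such a package hook system. The names and internals are illustrative only (not chaiNNer's actual implementation); the point is that the expensive call runs once per chain instead of after every upscale:

```python
# Hypothetical sketch of a package-level chain-finish hook.
_on_chain_finish_callbacks = []


def set_on_chain_finish(callback):
    """Let a package register work to run once, when the chain finishes."""
    _on_chain_finish_callbacks.append(callback)


def run_chain(nodes):
    # Execute every node first, then fire the finish hooks exactly once,
    # so an expensive call like torch.cuda.empty_cache() no longer runs
    # after every single upscale.
    results = [node() for node in nodes]
    for callback in _on_chain_finish_callbacks:
        callback()
    return results
```

A PyTorch package would then register something like `set_on_chain_finish(torch.cuda.empty_cache)` instead of calling it inside each upscale node.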

This will hopefully bring a small speed improvement to batch upscales.

@RunDevelopment
Member

This will hopefully bring a little speed improvement to batch upscales

"Hopefully"? Did... did you not measure whether your optimization actually made things faster?

And on that note: if you do an optimization, please also report how much faster this actually makes things. As a reviewer, I also have to weigh whether the speed-up is worth the tradeoff in code simplicity/quality/maintainability. I can of course measure things myself, but having you give a reference case/number is useful.

Not sure if this is the best api for it, but packages can now define on_chain_finish callbacks to be run when the chain is finished, to do things like resource clearing.

I would suggest the following API: add a def add_cleanup(callback: Callable[[], None]) -> None method to NodeContext. This lets nodes register cleanup functions that run when the chain is done. Cleanup callbacks are stored in a set, so a node can choose to run a cleanup action once per chain or once per node execution:

def do_clean_up(): ...

@register(...)
def my_node(context: NodeContext):
    # a new lambda object is created on every call, so the set keeps one
    # entry per node execution
    context.add_cleanup(lambda: do_clean_up())
    # the same function object is deduplicated by the set, so this runs
    # once per chain
    context.add_cleanup(do_clean_up)

This fixes the following issues with the current set_on_chain_finish API:

  • set_on_chain_finish always runs cleanup actions, even if no nodes from that package were executed.
  • Cleanup actions cannot assume that the package's dependencies are available.
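The set-based deduplication above could be sketched like this (the NodeContext internals here are hypothetical; only add_cleanup's signature comes from the suggestion):

```python
from typing import Callable


class NodeContext:
    """Illustrative sketch, not chaiNNer's actual NodeContext."""

    def __init__(self) -> None:
        # A set deduplicates callbacks: adding the same function object
        # twice only registers it once.
        self._cleanup_callbacks: set[Callable[[], None]] = set()

    def add_cleanup(self, callback: Callable[[], None]) -> None:
        self._cleanup_callbacks.add(callback)

    def run_cleanup(self) -> None:
        # Called by the executor once, when the chain finishes.
        for callback in self._cleanup_callbacks:
            callback()
        self._cleanup_callbacks.clear()


calls = []

def do_clean_up():
    calls.append("cleaned")

context = NodeContext()
# Simulate three node executions that each register the same cleanup.
for _ in range(3):
    context.add_cleanup(do_clean_up)  # same object -> deduplicated
context.run_cleanup()
print(len(calls))  # 1: the named function ran once per chain
```

Passing `lambda: do_clean_up()` instead would create a distinct object on each execution, so the set would hold three entries and the cleanup would run three times.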

@joeyballentine
Member Author

"Hopefully"? Did... did you not measure whether your optimization actually made things faster?

I did, after I made this PR, but that was with all of my changes combined. I'll test this one individually.

@joeyballentine
Member Author

Turns out this is doing most of the heavy lifting for the optimizations I tested: a 2m20s processing time went down to 2m01s with just this change alone.

@joeyballentine joeyballentine changed the title Add chain finish hook + empty cuda cache on chain finish Add per-node chain-finish cleanup functions + empty cuda cache only on chain finish May 17, 2024
@RunDevelopment
Member

RunDevelopment left a comment


Just some minor things.

joeyballentine and others added 3 commits May 17, 2024 12:15
Co-authored-by: Michael Schmidt <mitchi5000.ms@googlemail.com>
@RunDevelopment
Member

Probably needs a from __future__ import annotations somewhere.

@joeyballentine joeyballentine merged commit 3ae2726 into main May 17, 2024
@joeyballentine joeyballentine deleted the chain_hook_cuda_empty branch May 17, 2024 17:17