Add per-node chain-finish cleanup functions + empty cuda cache only on chain finish #2852
Conversation
"Hopefully"? Did... did you not measure whether your optimization actually made things faster? And on that note: if you do an optimization, please also report how much faster this actually makes things. As a reviewer, I also have to weigh whether the speed-up is worth the tradeoff in code simplicity/quality/maintainability. I can of course measure things myself, but having you give a reference case/number is useful.
I would suggest the following API: we have a

```python
def do_clean_up(): ...

@register(...)
def my_node(context: NodeContext):
    # run once per node execution
    context.add_cleanup(lambda: do_clean_up())
    # run once per chain
    context.add_cleanup(do_clean_up)
```

This fixes the following issue with the current
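The per-node vs. per-chain distinction in the suggestion above hinges on callback identity: a fresh lambda is a new object every execution, while the same bare function object can be deduplicated. A minimal sketch of how such a `NodeContext.add_cleanup` could behave (hypothetical implementation, not the actual chaiNNer API):

```python
from typing import Callable, List


class NodeContext:
    """Hypothetical context whose cleanup callbacks run when the chain finishes."""

    def __init__(self) -> None:
        self._cleanup: List[Callable[[], None]] = []

    def add_cleanup(self, fn: Callable[[], None]) -> None:
        # Deduplicate by identity: registering the same function object
        # from every node execution still runs it only once per chain,
        # while a fresh lambda per execution runs once per execution.
        if all(existing is not fn for existing in self._cleanup):
            self._cleanup.append(fn)

    def run_cleanup(self) -> None:
        # Called by the executor once, after the chain finishes.
        for fn in self._cleanup:
            fn()
```

With this scheme, `context.add_cleanup(do_clean_up)` from three node executions fires `do_clean_up` once, while `context.add_cleanup(lambda: do_clean_up())` fires it three times.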
I did, after I made this PR, but that was with all of my changes together. I'll test this individually.

Turns out this is doing most of the heavy lifting for the optimizations I tested: a 2m20s processing time went down to 2m01s with just this change alone.
Just some minor things.
Co-authored-by: Michael Schmidt <mitchi5000.ms@googlemail.com>
Probably needs an
After doing some profiling, I noticed that clearing the CUDA cache was taking significant time, and it was happening after every upscale, which I'm pretty sure is incorrect and bad. We only need to release the memory when we're actually done with it (i.e., when the chain finishes).

So I added a hook system to packages. I'm not sure this is the best API for it, but packages can now define on_chain_finish callbacks to be run when the chain is finished, to do things like resource clearing.

This will hopefully bring a little speed improvement to batch upscales.
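The hook system described above could look roughly like the following sketch (hypothetical names and a stand-in executor; the real chaiNNer internals will differ). A package registers a callback once, and the executor fires all callbacks a single time after the chain completes, instead of, say, calling `torch.cuda.empty_cache()` after every upscale:

```python
from typing import Callable, List

# Module-level registry of chain-finish hooks (illustrative only).
_on_chain_finish: List[Callable[[], None]] = []


def register_on_chain_finish(fn: Callable[[], None]) -> None:
    """A package calls this once, e.g. with torch.cuda.empty_cache."""
    _on_chain_finish.append(fn)


def run_chain(nodes: List[Callable[[], None]]) -> None:
    """Run every node in the chain, then fire each cleanup hook exactly once."""
    for node in nodes:
        node()
    for fn in _on_chain_finish:
        fn()
```

The design point is that the expensive cleanup cost is paid once per chain rather than once per node, which is where the batch-upscale speedup in this PR comes from.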