Make ncnn memory budget configurable #2070
Conversation
(I'll fix the Pyright error ASAP.)
CI is now passing.
@@ -125,7 +133,7 @@ def estimate_cpu():
     schema_id="chainner:ncnn:upscale_image",
     name="Upscale Image",
     description="Upscale an image with NCNN. Unlike PyTorch, NCNN has GPU support on all devices, assuming your drivers support Vulkan. \
-    Select a manual number of tiles if you are having issues with the automatic mode.",
+    Select a manual number of tiles or set a memory budget limit if you are having issues with the automatic mode.",
"or set a memory budget limit"

1. This doesn't tell users how to do that.
2. This solution is factually incorrect. Setting a memory budget won't help at all in GPU mode.
I'm not sure I follow; why would it not help in Vulkan mode?
Sorry, my bad. I didn't see the `heap_budget` in `min(heap_budget, vkdev.get_heap_budget() * 1024 * 1024 * 0.8)`.
Point 1 still stands, though.
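For reference, the quoted expression amounts to something like the following sketch. The wrapper function is a hypothetical name, not code from the PR; the `min(...)` expression itself is quoted above, and ncnn's `VulkanDevice.get_heap_budget()` reports MiB:

```python
def effective_vram_budget(heap_budget: int, vkdev) -> int:
    # Cap the user-configured budget (bytes) at 80% of the device's
    # reported heap budget; get_heap_budget() returns MiB, hence the
    # * 1024 * 1024 conversion to bytes.
    return int(min(heap_budget, vkdev.get_heap_budget() * 1024 * 1024 * 0.8))
```

So the user's budget does apply on the Vulkan path, but it can only lower the cap, never raise it past what the device reports.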
I don't see instructions on switching between GPUs in the backend metadata (though there is a mention of this in the README), which is why I didn't add similar instructions here. Would you be open to adding a note in the backend docs metadata about making sure the integrated GPU isn't selected, along with a note about setting a memory budget? (It seems to me that documenting both in-app would be a UX improvement.)
@@ -46,6 +47,7 @@ export const SettingsProvider = memo(({ children }: React.PropsWithChildren<unkn
     const useIsFp16 = useMemoArray(useLocalStorage('is-fp16', false));
     const usePyTorchGPU = useMemoArray(useLocalStorage('pytorch-gpu', 0));
     const useNcnnGPU = useMemoArray(useLocalStorage('ncnn-gpu', 0));
+    const useNcnnBudgetLimit = useMemoArray(useLocalStorage('ncnn-budget-limit', 1024 ** 5));
Why did you make the default 1 Petabyte?
The intent was to pick a default that is practically equivalent to "no limit". I find it very unlikely that any usage will go over 1 PiB; 1 TiB is low enough that some users might hit it (some of my test workflows hit 600 GiB or so, and I'm not as much of a power user as some people).
Wdym "might hit it"? This limit is for RAM/VRAM, right? Who even has 1 TB of RAM? I don't think people will be running chaiNNer on supercomputers any time soon.
My workstation has around 300 GiB of RAM, and I typically run chaiNNer in a VM on that workstation that has around 200 GiB of RAM assigned to it. I wouldn't be surprised if there exists some power user out there with 1 TiB of RAM (my mainboard supports up to 2 TiB), and I'm trying to avoid surprising behavior in such (admittedly rare) situations.
That said -- am I correct in assuming that your concern here is that 1 PiB looks "weird" in the GUI as the default, as opposed to having a concern about the default being "no effective additional limit"? If so, would it be OK to instead modify the backend to special-case a limit of 0 so that 0 means "no additional limit", and make 0 the default?
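If that's acceptable, the backend side of the proposal could be as small as this sketch (the function name and the sentinel handling illustrate the suggestion above; this is not existing code):

```python
def resolve_budget_limit(budget_limit: int) -> int:
    # Proposed sentinel: a stored value of 0 means "no additional limit",
    # so return an effectively unbounded cap and let the automatic
    # estimate govern on its own.
    if budget_limit <= 0:
        return 2**63 - 1
    return budget_limit
```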
42b1a92 to 1dc6422 (compare)
Putting this on hold until #2088 is merged.
)
else:
    # Empirically determined
Since you have so much RAM, I'm wondering if you'd be willing to run some tests. I've done some testing on NCNN VRAM estimation in the past (though only with ESRGAN), and I can already tell you that this isn't really correct, and neither is what we already have in place. I've only got 8GB VRAM, so I could never extend my tests far enough to gather fully complete data.
What I can say is that:
- Model size is not strongly correlated with VRAM usage. A model can be made larger simply by performing more convolutions, which does not matter for VRAM usage because they are executed in sequence. The only thing model size definitely correlates with is how much VRAM it takes to store the model itself.
- Individual weight sizes have a correlation with VRAM usage when running a model.
- Scale needs to be accounted for, which our estimation does not currently do. This estimation is based around a 4x scale, but an 8x model will blow past it.
I abandoned this back in the day because there seemed to be further factors I couldn't account for with the data I had, but maybe we can finally figure it out. Unfortunately, I seem to have deleted the set of different scale/nf/nb ESRGAN models I generated for these tests. If I can remember how I generated them all, I could send them to you.
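To make the shape of the problem concrete, an estimator along the lines of the points above might look like the sketch below. The functional form and every coefficient are placeholders to be fitted from measured runs, not results anyone has established:

```python
def estimate_vram_bytes(
    weight_bytes: int,  # bytes to store the model weights themselves
    tile_px: int,       # square tile edge length in pixels
    scale: int,         # model scale factor (2x, 4x, 8x, ...)
    a: float,           # fitted coefficient: per-input-pixel cost
    b: float,           # fitted coefficient: per-output-pixel cost
    c: float,           # fitted constant overhead
) -> float:
    # Fixed cost for holding the model, plus a term for the input tile
    # and a scale-dependent term for the output, per the points above.
    input_cost = a * tile_px * tile_px
    output_cost = b * (tile_px * scale) ** 2
    return weight_bytes + input_cost + output_cost + c
```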
@theflyingzamboni I would definitely be interested. Can we split this out into its own issue so that it doesn't get lost once this PR is merged?
Superseded by #2351
#1867 didn't support automatic estimation of tile size, which was not great for UX. This PR adds that missing support by allowing the user to choose a custom memory budget. Choosing a memory budget is likely to be better UX than choosing a tile size, since users are more likely to know how much RAM or VRAM they have than to know which tile size will work with their hardware and the model they picked. This should also work fine with Vulkan (and is likely to help with this UX issue), though I wasn't able to test that.
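The idea behind deriving a tile size from a budget, roughly: given a memory budget and an estimate of how many bytes a given model consumes per input pixel, the largest safe square tile follows directly. A minimal sketch under that linear-growth assumption (both the function and the per-pixel figure are illustrative, not the PR's actual code):

```python
import math

def max_tile_size(budget_bytes: int, bytes_per_input_pixel: float) -> int:
    # Largest square tile that fits the budget, assuming memory use grows
    # roughly linearly with the number of input pixels. The per-pixel
    # figure would come from a model-specific estimate.
    if bytes_per_input_pixel <= 0:
        raise ValueError("bytes_per_input_pixel must be positive")
    return int(math.sqrt(budget_bytes / bytes_per_input_pixel))

# e.g. an 8 GiB budget at a hypothetical 500 KiB per input pixel:
# max_tile_size(8 * 1024**3, 500 * 1024) -> 129 px tiles
```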