
Conversation

@JeremyRand (Contributor)

This is a useful option on systems with strong CPU and weak GPU, where PyTorch is nontrivial to install (e.g. ppc64le systems).

Currently CPU inference is keyed on which NCNN package is installed; a future PR could allow CPU inference even when the Vulkan NCNN package is installed.
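
For illustration, a minimal sketch of what "keyed on which NCNN package is installed" could look like. The module names and import paths here are assumptions for illustration, not necessarily chaiNNer's actual detection logic:

```python
import importlib.util

# Hypothetical detection: prefer the Vulkan fork when present, otherwise
# fall back to upstream NCNN for CPU-only inference. Both module names
# are assumptions, not chaiNNer's actual code.
if importlib.util.find_spec("ncnn_vulkan") is not None:
    from ncnn_vulkan import ncnn  # assumed import path for the fork
    use_gpu = True
else:
    import ncnn  # upstream package: CPU inference only
    use_gpu = False
```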

@JeremyRand (Contributor, Author)

(Will fix the lint errors ASAP)

JeremyRand force-pushed the ncnn-cpu branch 4 times, most recently from cc7d0b6 to ff8c3ef on June 15, 2023 at 07:26.

@JeremyRand (Contributor, Author)

Linters are passing now; sorry that took so many force-pushes.

@joeyballentine (Member) commented Jun 15, 2023

Did you test this? There's no way this should work. ncnn_vulkan is my fork of NCNN that has multiple changes we rely on; model loading shouldn't even work properly using the regular package.

@JeremyRand (Contributor, Author) commented Jun 15, 2023

I upscaled a few images and it worked fine here AFAICT. NCNN was also a lot faster than PyTorch on POWER9 once I fixed some POWER9-specific optimization flags in the NCNN build scripts: upscaling my test image on POWER9 took 1:05 with NCNN compared to 1:32 with PyTorch (both times include Electron startup/shutdown overhead, so the actual inference speed difference is a bit larger). I'm planning to submit those optimization patches to upstream NCNN soon. I have no idea how NCNN compares to PyTorch on x86 in terms of CPU inference speed, but I'm guessing NCNN doesn't do terribly there.

@joeyballentine (Member)

Huh, well in that case it sounds like it's fine. I wanna test some stuff myself first, though.

@JeremyRand (Contributor, Author)

Sure, test as much as you like. It's entirely possible that there's some broken edge case that my testing failed to tickle. (I'm not familiar enough with the chaiNNer codebase to know what edge cases to test for here, so I just did a simple upscale operation a few times.)

@JeremyRand (Contributor, Author)

Worth noting that NCNN's CPU inference is supposedly very well-optimized for ARM64, so if someone has an ARM64 machine that they could test this on (and compare to PyTorch speed), that might be interesting.

@joeyballentine (Member) commented Jun 15, 2023

Haven't been able to test this yet, but I just realized: we should be able to run on CPU using ncnn_vulkan; there should be no reason to need the other package. We should be able to just set the right device and turn Vulkan compute off, I think.
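
For reference, a minimal sketch of that idea using NCNN's Python bindings. The option and method names are the ones upstream ncnn exposes; whether the ncnn_vulkan fork mirrors them exactly is an assumption, and the paths are illustrative:

```python
import ncnn  # upstream bindings; the ncnn_vulkan fork may expose the same API

net = ncnn.Net()
net.opt.use_vulkan_compute = False  # disable Vulkan so inference runs on CPU
net.load_param("model.param")       # illustrative paths, not chaiNNer's code
net.load_model("model.bin")

ex = net.create_extractor()
# ...feed input and extract output as usual; compute now stays on the CPU
```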

@JeremyRand (Contributor, Author) commented Jun 15, 2023

Yes, I was intending to submit such a feature as a follow-up PR. There are a few reasons why some users might prefer to use upstream NCNN:

  • Easier to build from source since there are fewer dependencies.
  • Updated more frequently.
  • More eyes on the code.

In particular, I couldn't get ncnn_vulkan to build on ppc64le, and while that is probably fixable, I don't think it's desirable to force users to install ncnn_vulkan if they don't intend to use Vulkan. The chaiNNer changes needed to support upstream NCNN are tiny, as this PR shows. But yes, I'm in favor of supporting CPU inference with ncnn_vulkan as well.

@joeyballentine (Member)

I didn't consider needing to build from source. That makes sense.

My biggest concern is any kind of breaking change in the upstream one. I am currently unable to merge the upstream changes into ncnn_vulkan (too many merge conflicts), so ncnn_vulkan is going to be stuck on its current version until I get around to redoing all the necessary changes in a new branch.

But it's probably fine; I don't think anything we use would have an API change, and even so, CPU NCNN isn't going to be used frequently.

So yeah, it's probably fine as is.

@theflyingzamboni (Collaborator)

The lack of load_param_mem and load_model_mem, and the associated hackiness with the temporary files, isn't ideal. I don't know why they left those out of the pybinds in the official repo.

@JeremyRand (Contributor, Author)

Agreed that the tempfile workaround is a bit annoying, and that it would be desirable for upstream NCNN to add the missing functions. That said, the workaround is only a few lines of code, and isn't particularly bad for readability, so I think it's an acceptable cost of supporting upstream.
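
For context, the workaround amounts to something like the following. This is a hedged sketch, not necessarily the exact code in the PR; the helper name is hypothetical:

```python
import os
import tempfile

import ncnn


def load_model_from_memory(net: ncnn.Net, param_bytes: bytes, bin_bytes: bytes) -> None:
    """Round-trip model data through temporary files because upstream
    NCNN's pybinds lack load_param_mem/load_model_mem. Illustrative only."""
    param_path = bin_path = None
    try:
        with tempfile.NamedTemporaryFile(suffix=".param", delete=False) as f:
            f.write(param_bytes)
            param_path = f.name
        with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as f:
            f.write(bin_bytes)
            bin_path = f.name
        net.load_param(param_path)  # upstream pybinds only accept file paths
        net.load_model(bin_path)
    finally:
        # Clean up the temporary files regardless of load success
        for path in (param_path, bin_path):
            if path is not None:
                os.unlink(path)
```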

@joeyballentine (Member)

We could submit a PR with those pybinds added and wait for an official release that includes them.

@JeremyRand (Contributor, Author)

Is there a particular commit from ncnn_vulkan containing those pybinds that I could cherry-pick onto upstream master and submit to them?

@joeyballentine (Member)

@theflyingzamboni did that part, so it's up to him whether he wants to do that or would rather you do it. I can check later which commit it was.

@JeremyRand (Contributor, Author)

FWIW I have no experience with pybinds, so it may go more smoothly if he submits it.

@theflyingzamboni (Collaborator)

I'll take a look soon.

@joeyballentine (Member)

Even though this isn't an ideal implementation, it'll take way too long to get the stuff we'd need added to the official NCNN, so I think we're fine just using this for now. It won't affect regular users.

joeyballentine merged commit 6fd553e into chaiNNer-org:main on Jun 24, 2023.
JeremyRand deleted the ncnn-cpu branch on December 5, 2023.