Description
Problem
SciPy adopted uarray to support a multi-dispatch mechanism, the goal being that OpenMP code, GPU kernels, etc. need not live in the SciPy codebase itself. See the motivation below for more concrete discussion.
SciPy currently supports this through the scipy.fft module.
Other SciPy modules would also benefit from a uarray backend, with the usage later extended through libraries like CuPy (cupyx.scipy) and Dask (dask.array).
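For concreteness, this is roughly what the existing scipy.fft dispatch looks like from the user side; a minimal sketch assuming CuPy is installed (cupyx.scipy.fft already exposes a uarray-compatible backend):

```python
import cupy as cp
import cupyx.scipy.fft as cufft
import scipy.fft

x = cp.random.random(1024).astype(cp.complex64)

# Inside this context, scipy.fft multimethods dispatch to the CuPy backend,
# so the call below runs on the GPU via cupyx.scipy.fft.fft.
with scipy.fft.set_backend(cufft):
    y = scipy.fft.fft(x)
```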
Proposed Modules
-
scipy.ndimage
ENH: ndimage: uarray based backend support #14356
Note: cupyx.scipy.ndimage has almost all functions implemented (except a couple: geometric_transform, watershed_ift), while dask-image is currently less complete. dask-image also has a different namespace structure at the moment, but "Import all ndimage functions into a single namespace" dask/dask-image#198 plans to address this. (A sketch of how the uarray dispatch would work for such functions follows this list.)
- Filters
- Fourier filters
- Interpolation
- Measurements
- Morphology
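To make the proposed mechanism concrete, here is a minimal sketch of how a uarray multimethod and a backend fit together. The gaussian_filter signature, the "numpy.scipy.ndimage" domain string, and the backend class are illustrative assumptions, not necessarily what the linked PR uses:

```python
import numpy as np
import uarray as ua


def gaussian_filter(input, sigma):
    # Argument extractor: marks which arguments are dispatchable and on what type.
    return (ua.Dispatchable(input, np.ndarray),)


def _gaussian_filter_replacer(args, kwargs, dispatchables):
    # Put the (possibly coerced) array back into the call's positional arguments.
    return (dispatchables[0],) + args[1:], kwargs


# Turn the extractor into a multimethod; backends registered for this domain
# provide the actual implementation.
gaussian_filter = ua.generate_multimethod(
    gaussian_filter, _gaussian_filter_replacer, "numpy.scipy.ndimage")


class ScipyNdimageBackend:
    """Illustrative reference backend that simply forwards to scipy.ndimage."""
    __ua_domain__ = "numpy.scipy.ndimage"

    @staticmethod
    def __ua_function__(method, args, kwargs):
        from scipy import ndimage
        fn = getattr(ndimage, method.__name__, None)
        if fn is None:
            return NotImplemented  # lets uarray try other backends
        return fn(*args, **kwargs)


with ua.set_backend(ScipyNdimageBackend()):
    out = gaussian_filter(np.ones((5, 5)), 1.0)
```

A CuPy or Dask backend would look the same, only forwarding to cupyx.scipy.ndimage or dask-image instead.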
-
scipy.linalg
[WIP] ENH: uarray support for linalg.decomp #14407
TODO: Add a more comprehensive note on cross-library availability of functions later. For now, a quick look tells me that not all functions are available in cupy or dask.
- Basics
- Eigenvalue Problems
- Decompositions
- Matrix Functions
- Matrix Equation Solvers
- Sketches and Random Projections
- Special Matrices
- Low-level routines
-
scipy.special
ENH: Add uarray support for scipy.special #15665
Note: These are element-wise functions; they can be made to work with Dask fairly easily later on (see the sketch below). CuPy already has some of the functions.
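As a rough illustration of why element-wise functions are easy for Dask (independent of the uarray machinery), they already apply cleanly chunk-by-chunk; a minimal sketch:

```python
import dask.array as da
from scipy import special

x = da.random.random((4000, 4000), chunks=(1000, 1000))

# An element-wise special function applies chunk-by-chunk with no cross-chunk
# communication, which is why Dask support is straightforward to add.
y = da.map_blocks(special.erf, x)
result = y.compute()
```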
Obviously once SciPy support is added, these libraries should be updated to make use of uarray, similar to what was done here.
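For instance, with the scipy.fft mechanism that already exists, a library can register its backend at import time so users get dispatch without wrapping every call in a context manager; a minimal sketch assuming CuPy (the uarray support for the other modules would presumably offer analogous hooks):

```python
import scipy.fft
import cupyx.scipy.fft as cufft

# Register the CuPy backend so scipy.fft will try it (in addition to the default)
# without requiring an explicit `with scipy.fft.set_backend(...)` block.
scipy.fft.register_backend(cufft)
```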
Motivation for uarray
See gh-10204 comment
The protocol not covering things like array creation functions is one thing, but there's a more important limitation I think: it is specific to "types of arrays". So if you want to create functions with the same API for GPU arrays (CuPy, PyTorch), distributed arrays (Dask), sparse arrays (scipy.sparse, pydata/sparse), then it works. But what if you want to provide an alternative implementation for ndarrays? You simply cannot do that. Pyfftw, mkl-fft and pypocketfft all work on regular numpy arrays. So letting the numpy array carry around information about what implementation to use is just fundamentally not going to work. Instead, it's the library that must be able to say "hey, here's an implementation (perhaps for specific types)", and a mechanism for either automatic or user-controlled selection of which implementation/backend to use.
See gh-13965 comment
For example, a CUDA-based tensor object from a deep learning framework could invoke CuFFT. I think (not 100% certain) that this also allows you to slot in your own preferred FFT library as a backend even for plain-old numpy ndarray objects. We used to have multiple FFT backends selected at build time, but it was difficult to add new ones, and not easy to support incompatibly-licensed FFT libraries like the popular FFTW. I think this new multidispatch mechanism allows that to be slotted in at runtime.
See gh-13965 comment
It's possible it will extend to scipy.linalg, as it also has some need to swap out backends like that, but it probably won't be a widely used pattern across all of scipy.