-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
ENH: linalg
/signal
/special
: unify approach for array API dispatching
#21012
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
scipy/_lib/_array_api.py
Outdated
def get_array_subpackage_func(f_name, xp, n_array_args, root_namespace, | ||
subpackage, generic_implementations): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When generalizing the signature, I didn't think terribly carefully about the new argument names, whether they should be keyword only, etc.
root_namespace
- the name of the private module in which SciPy's implementation lies. For instance, for thespecial
functions, it's_ufunc
. Some generalization may be needed if the SciPy implementations are spread across multiple files. There are lots of possible solutions. The most obvious is to callget_array_subpackage_func
with the appropriateroot_namespace
for the givenf_name
; another possibility is to import all the SciPy implementations into a single, private module, and setroot_namespace
to the name of that module.subpackage
- e.g. 'signal', 'linalg', 'special'generic_implementations
- a dictionary of callables that defines functions that may not have direct equivalents in all other backends but can be defined in terms of otherxp
orspx
functions. We use this inspecial
to definexlogy
asx * xp.log(y)
andchdtrc
in terms ofgammaincc
(when available), for instance.
The intent was to produce a proof of concept; we can wordsmith everything to our heart's content if we prefer this approach.
f = getattr(getattr(spx, subpackage, None), f_name, None) | ||
f = f or getattr(getattr(xp, subpackage, None), f_name, None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CuPy's linalg
is in cupy
, but its signal
is in cupyx
. Perhaps there is a better way to handle this, but for now, look in one, and if not found, look in the other.
scipy/_lib/_array_api.py
Outdated
|
||
@functools.wraps(func) | ||
def wrapped(*args, **kwargs): | ||
xp = array_namespace(*args[:n_array_args]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this assumes that the arrays come first as positional arguments. I'll explore solutions for passing arrays as keyword arguments in a follow-up commit if we think this has potential. If we have the developer specify the number of positional-only array arguments, a list of keyword-only arguments, and a dictionary of names of positions: keywords for those that can specified either way, some relatively simple logic can extract the arrays from *args
and **kwargs
.
Update: resolved for simple cases.
xp = array_namespace(*args[:n_array_args]) | ||
f = get_array_subpackage_func(f_name, xp, n_array_args, root_namespace, | ||
subpackage, generic_implementations) | ||
return f(*args, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though special functions don't accept kwargs
, I tested this a bit, and it seems to work unmodified. For instance:
import cupy as xp
from scipy import signal
sig = xp.linspace(0, 1, 100)
win = xp.linspace(0.25, 0.75, 50)
signal.convolve(sig, win, mode='same').shape # (100,)
signal.convolve(sig, win, mode='full').shape # (149,)
signal.convolve(sig, win, mode='valid').shape # (51,)
Of course, this assumes that all keyword arguments are to be passed to all backends. We can add nice error message for those that aren't supported for all backends if desired.
@@ -121,6 +121,7 @@ def test_warning_calls_filters(warning_calls): | |||
os.path.join('stats', '_stats_py.py'), # gh-20743 | |||
os.path.join('stats', 'tests', 'test_axis_nan_policy.py'), # gh-20694 | |||
os.path.join('_lib', '_util.py'), # gh-19341 | |||
os.path.join('_lib', '_array_api.py'), # TBA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
os.path.join('_lib', '_array_api.py'), # TBA | |
os.path.join('_lib', '_array_api.py'), # 21012 |
warnings.simplefilter("ignore") | ||
# Strange that this is needed. For now, filtering warnings | ||
# to pass tests; we'll figure out a better strategy. | ||
import cupyx.scipy.signal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is probably a better way to do this, but:
import cupyx
getattr(cupyx.scipy, 'special') # ok
getattr(cupyx.scipy, 'signal') # AttributeError: module 'cupyx.scipy' has no attribute 'signal'
import cupyx.scipy.signal
getattr(cupyx.scipy, 'signal') # ok
For now, I was just trying to get this to work. Alternatives welcome.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
x-ref cupy/cupy#8336. Hopefully CuPy will behave like SciPy at some point.
# `reversed` is for developer convenience: test new function first = less waiting | ||
@pytest.mark.parametrize('f_name_n_args', reversed(array_special_func_map.items())) | ||
@pytest.mark.parametrize('dtype', ['float32', 'float64']) | ||
@pytest.mark.parametrize('shapes', [[(10, 10), (10,)]]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, this makes the assumption that a valid first arg is 2d and the second should be 1d, which is true for many linalg
functions. This can be generalized, or there is no reason why a single test needs to be parameterized over all functions. It's still more compact than translating all existing tests.
args_xp = [xp.asarray(arg[()], dtype=dtype_xp) for arg in args_np] | ||
|
||
res = f(*args_xp) | ||
ref = xp.asarray(f(*args_np), dtype=dtype_xp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can include kwargs
to be tested in the array_special_func_map
.
_SCIPY_ARRAY_API = os.environ.get("SCIPY_ARRAY_API", False) | ||
array_api_compat_prefix = "scipy._lib.array_api_compat" | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_SCIPY_ARRAY_API = os.environ.get("SCIPY_ARRAY_API", False) | |
array_api_compat_prefix = "scipy._lib.array_api_compat" |
scipy/_lib/_array_api.py
Outdated
|
||
|
||
def support_alternative_backends(f_name, n_array_args, root_namespace, | ||
subpackage, generic_implementations): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is convenient in special
to loop over function names, etc., to add array API support for functions. For other sub-packages, it might be better to create a version of this function that is applied as a decorator. I'll explore that idea in the next commit.
linalg
/signal
: leverage scipy.special
approach for array API dispatchinglinalg
/signal
/special
: unify approach for array API dispatching
The next three commits refactor a bit further but show how we could do the dispatching with a decorator. Even though we'd have to add decorated functions to the tests manually, it's a lot simpler overall because this approach wouldn't require changes to |
The latest commit resolves #21012 (comment) for the case in which array arguments are specified as consecutive positional or keyword arguments, which is the most common case. data-apis/array-api-compat#153 is an issue if the arguments are specified in an unusual way. At least one of the property-based array API tests will fail because |
@@ -280,6 +281,7 @@ def eig(a, b=None, left=False, right=True, overwrite_a=False, | |||
return w, vr | |||
|
|||
|
|||
@_xp_dispatch(['a', 'b'], 'linalg') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is all it takes to get the dispatching working as long as the array arguments are all consecutive at the beginning of the signature (very common). They can be positional only or positional/keyword. I think this will work for all linalg functions. I found a handful of signal functions that don't fit this mold, and that's fine - we just need to look closer to know whether it's worth generalizing or whether these can be treated ad hoc.
Currently, this assumes the xp-backend function has the same signature as the SciPy function, so it doesn't emit helpful errors if this is not the case and unsupported arguments are passed. There are plenty of ways to deal with this, but I'd rather not go too much deeper into this without initial feedback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks pretty good to me! The diff is pleasingly small. Agreed that we do not need to check that other libraries' functions pass all of our tests. Importantly, I like that this seems quite flexible - we can choose which functions are included, and add generic implementations at a later point, quite easily.
|
||
_generic_implementations = {} | ||
|
||
array_special_func_map = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
array_special_func_map = { | |
array_linalg_func_map = { |
|
||
_generic_implementations = {} | ||
|
||
array_special_func_map = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
array_special_func_map = { | |
array_signal_func_map = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably will just make it array_func_map
or so. Or TBH might remove this and use the decorator for everything? The reason we didn't do that in special
is that there's not a Python function to decorate, but for the other modules, it would probably be nicer?
Thanks @lucascolley. Do you want all functions to support |
Let's start with just emitting nice errors. We should figure out how to start supporting other keywords in an agnostic way, though. Can we somehow only support |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The philosophy is that we do not need to test the quality of the alternative backend implementations to the same standards as we test the SciPy implementations; rather, we really just need to test the "plumbing": for typical cases, when we pass an alternative-backend array into one of these dispatched functions, is type in == type out and do the numerical values agree with those of NumPy/SciPy?
FTR, I disagree with this. The existing SciPy test suite serves as a contract to what the the SciPy functionality actually is. If we are to dispatch to other backends, we should IMO keep the contract and be explicit of what we xfail/skip.
I think that it would be a good idea to convert the tests eventually1, but while the behaviour is experimental and opt-in via an env variable, surely we are fine to punt on it for now. Any users who are going out of their way to help us test this experimental dispatching should be smart enough to send bug reports to the appropriate library, rather than to us. Footnotes
|
For the dispatch itself, I've two high-level comments first.
How about using the fact that a module is a namespace itself, and do something along the lines of
|
I disagree. |
Okay, let's split this discussion out to gh-21031. |
IIUC you're referring to the However, this map is just what I did for
No, if we're not allowed to make any modifications, it will not work for all APIs. But "I think this will work for all linalg functions. I found a handful of signal functions that don't fit this mold, and that's fine - we just need to look closer to know whether it's worth generalizing or whether these can be treated ad hoc." (#21012 (comment).). We can move quickly through the low-hanging fruit, identify the functions that don't fit the mold, and make the generalizations as needed. If we find that we need to add support for functions like your As for |
So for import cupy as xp
from scipy.ndimage import convolve1d
input = [2, 8, 0, 4, 1, 9, 9, 0]
weights = [1, 3]
output = xp.empty(8, dtype=int)
convolve1d(xp.asarray(input), weights=xp.asarray(weights), output=output)
print(output, type(output))
# [14 24 4 13 12 36 27 0] <class 'cupy.ndarray'> However, if the user were to pass an invalid argument for An example of a function that wouldn't work for all arguments as written is For the sake of argument, I can also imagine examples of signatures in
Again, if we find a large class of functions like this that we want to dispatch, we either ensure that the SciPy versions of the functions convert to NumPy arrays, or can generalize the code to handle keyword-only array arguments. |
Here's what I meant: #21091
gh-21091 has the whole of ndimage, and it's explicit which ones need what. Not sure how to simplify it further. |
OK, so it looks like you're using
OK, since the first argument of all of those functions is an array, I think that functions wrapped per this PR would probably pass all the same tests with CuPy as those wrapped by gh-21091. The difference is just that if the user were to pass arrays with different backends, this PR would not raise an error immediately in IIUC, gh-21091 does not convert other CPU arrays (e.g. torch CPU) to NumPy, run the SciPy function, and convert back. Is that right? |
Reference issue
gh-20772
gh-19260
What does this implement/fix?
As discussed in gh-20772 and gh-19260, this is an alternative approach to dispatching
linalg
andsignal
functions to the native backend of the input arrays.Rather than adding separate approaches for dispatching to each SciPy subpackage, it proposes maintaining a single dispatch mechanism based on the existing code from
scipy.special
(with minor generalizations - see diff here).Also, rather than translating the original test suites to use array API calls, this would only add property-based tests to confirm that the functions return similar results with each backend as they do for NumPy/SciPy. The philosophy is that we do not need to test the quality of the alternative backend implementations to the same standards as we test the SciPy implementations; rather, we really just need to test the "plumbing": for typical cases, when we pass an alternative-backend array into one of these dispatched functions, is type in == type out and do the numerical values agree with those of NumPy/SciPy?
Of course, by adjusting the definition of "typical" in the tests, we can make the tests arbitrarily strong - for example, strong enough such that a function that passes the generic property based test would certainly pass all of SciPy's function-specific tests. (Note that this idea is orthogonal to the one above; we can certainly leverage the existing dispatch framework but not add these property-based tests, if we prefer to translate the test suites.)1
The intent is to reduce author and review time for new functions and to minimize the diff/be non-invasive (i.e. to keep the original code readable/avoid introducing bugs in existing functions). We may find that additional generalizations are needed as we add more functions. That's OK. The intent here is to take care of low hanging fruit, and we can generalize when the time comes.
Additional information
The first commit moves the key
_support_alternative_backends
functions fromspecial
to_lib/_array_api.py
and generalizes it slightly. GitHub shows this as a large diff (remove from one place, add to another), but all that has changed is adding some additional arguments to generalize fromspecial
to any sub-package. I'd suggest comparing the diff manually.The next two commits add support and tests for a few functions in
signal
andlinalg
, respectively.I'll make some additional notes inline.
Footnotes
One might ask why we didn't do these things for
stats
. Let's separate this into implementation and testing:Implementation: so far, we've been translating functions which don't have equivalents in other backends (for instance,
ttest_ind
is not implemented in JAX, CuPy, or Torch) and can be written efficiently in pure Python, array API standard calls (plus some dispatchedscipy.special
function calls). So in short, we can't just dispatch, and there would be little advantage to doing so. Instead, we translate.Testing: TBH, we could have left the existing pure-NumPy tests alone and only added property-based tests. However, the difference in
stats
compared to say,special
, is that because we are providing the whole implementation for other backends rather than just dispatching, it's no longer sufficient just to check the basic plumbing. We really do need to ensure full test coverage to ensure that all lines of code work as intended for all backends. It's possible to make the property-based tests sufficiently strong, and it might have been a bit simpler/faster to go this route... I'm not sure. The truth is that we just didn't try it. One advantage to translating the test functions is that it's given us a chance to improve SciPy's tests. ↩