-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
ENH: sparse: first pass at array API standard compat #20190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
array_api
compat
A first "fix" would probably be moving the folder under |
Indeed 👍. SciPy is largely an array-consumer library, so having this at the top-level would be very confusing. Keeping this as close to the array-producer part of |
@willow-ahrens, I believe you'd mention that you were working on getting |
We've been building it array api-compliant, but haven't begun testing with the Array API suite. I'll try to remember to update when we get to the test suite. |
Definitely interested @willow-ahrens - I left out operations that don't already have standard implementations. It seems like certain operations like |
array_api
compatarray_api
compat
@lucascolley tried to add CI but I don't think it'll run without approval? Not sure it's worth really delving into this ATM unless we really think it'll help move things along, especially given the status of 1D arrays. I also definitely don't think non-trivial implementations should go here since array-api-tests do not test for correctness (and even for things like |
sounds good. As I said, no rush on the CI check. FYI, we are able to skip certain CI workflows with tags in commit messages, see http://scipy.github.io/devdocs/dev/contributor/continuous_integration.html#skipping. That might be useful here, e.g. if you don't need to build the docs and would only like GitHub Actions to run. |
maybe a next step tomorrow, thanks @lucascolley :) |
First up: https://data-apis.org/array-api/latest/API_specification/generated/array_api.where.html#where This appears to be necessary for solving at least part of dask/dask#10980 (comment) Interestingly, the array api version does not actually cover all the use-cases I'd be interested in (or rather the ones where numpy currently has coverage): from numpy import array_api as npa
import numpy as np
X = np.random.randn(20, 20)
npa.where(X>0, 10, X) # errors out because 10 is not an array as specified |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting @ilan-gold, thanks for sharing.
First a comment on the module structure: putting this into scipy/sparse/_array_api/
(private) would avoid exposing a lot of things publicly. At least at the start, using it via __array_namespace__
would make sense. And it's possible (desirable?) that it could live directly in sparse
in the future - like in numpy 2.0
npa.where(X>0, 10, X) # errors out because 10 is not an array as specified
Allowing Python scalars is a minor convenience thing that is numpy-specific. Using where(X>0, asarray(10), X)
works.
Oh very cool!!! |
CI now running under |
[skip cirrus] [skip circle]
[skip cirrus] [skip circle]
thanks for getting this set up @ilan-gold ! I just pushed a few commits to get the CI job over the line and we're now at:
so still a long way to go, but nice to be able to see what is missing! Hopefully #21197 (comment) will help us make some further progress! |
Thanks @lucascolley !! |
Reference issue
Towards #18915
What does this implement/fix?
This begins the works towards an
array_api
compliant implementation. I've take some first steps here just to see what works and what doesn't and document how to do this. Of note is that most of the current errors arise from either indexing problems or lack of 1d support. See also #20128Additional information
In terms of current failing tests on account of missing names/functions (i.e., not 1D issues), we have a sampling here, although this is probably the majority:
At the end of the day, I'm not sure if this is worth doing or not. I will remember to come to the working group to discuss in two weeks! That being said, a lot of the missing functions probably could be implemented and this might serve as a nice roadmap for new features that would make QOL better for many. For example, one of the blockers for dask integration I found was the
keepdims
arg being passed through tocsr_matrix.sum
but if we could rely on the array api implementation here, we could probably ignore it or handle it better with the advent of 1D arrays.