Skip to content

Conversation

mdhaber
Copy link
Contributor

@mdhaber mdhaber commented May 3, 2024

Reference issue

Towards gh-20544

What does this implement/fix?

Adds array API support to scipy.stats.rankdata.

Additional information

No need to review this yet. While working on it, I found that it needs several things that aren't covered by array_api_compat yet (e.g. repeat, moveaxis, NumPy asarray with copy=False), and it would also really benefit from the array API supporting take_along_axis (data-apis/array-api#177) and diff (data-apis/array-api#791). Also, without put (or preferably put_along_axis), this needed a second argsort. As-is, this would introduce a substantial performance regression.

@mdhaber mdhaber added scipy.stats enhancement A new feature or improvement array types Items related to array API support and input array validation (see gh-18286) labels May 3, 2024
@github-actions github-actions bot added scipy._lib CI Items related to the CI tools such as CircleCI, GitHub Actions or Azure labels May 3, 2024
@mdhaber
Copy link
Contributor Author

mdhaber commented May 23, 2024

array-api-compat (and maybe the array API itself) isn't quite ready for this. We'll re-asses after the next array-api-compat release.

@dbolser
Copy link

dbolser commented Oct 8, 2024

How come this is closed?

@lucascolley
Copy link
Member

This PR had to add some shims for things which have only recently been added to the standard like diff, moveaxis and take_along_axis. Once array-api-compat supports those features we won't need the shims here so the PR will be cleaner.

In theory we could merge this with the shims, but I suppose it was judged better to prioritise other things from a developer-churn perspective.

@dbolser
Copy link

dbolser commented Oct 10, 2024

Thanks for the explanation. I see why the PR isn't going to be included (or specifically, isn't efficient to include given the bigger architectural picture).

I guess I'm looking for the issue to keep track of the stats.rankdata request, as a couple of few other stats functions are waiting for that (#20544).

@lucascolley
Copy link
Member

If you keep watching this PR, we'll try to make some noise here when work is resumed on rankdata (in this PR or elsewhere).

@mdhaber
Copy link
Contributor Author

mdhaber commented Mar 3, 2025

I think this is possible to revive now that the standard has take_along_axis and diff. We can use put_along_axis with NumPy and do a second argsort for other backends.

@dbolser
Copy link

dbolser commented Mar 7, 2025

I think this is possible to revive now that the standard has take_along_axis and diff. We can use put_along_axis with NumPy and do a second argsort for other backends.

Thanks @mdhaber !

Looking forward to mannwhitneyu 😛

@mdhaber
Copy link
Contributor Author

mdhaber commented Mar 7, 2025

I'm guessing it will be ready in 1.17 if all goes well.

@mdhaber mdhaber reopened this Apr 22, 2025
@mdhaber
Copy link
Contributor Author

mdhaber commented Apr 23, 2025

Failures are unrelated.

@mdhaber mdhaber marked this pull request as ready for review May 9, 2025 10:40
Copy link
Member

@j-bowhay j-bowhay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few optional comments but otherwise all good I think

Probably should merge main and run CI as imagine there might be changes needed for #22949 etc

@mdhaber
Copy link
Contributor Author

mdhaber commented May 12, 2025

Thanks @j-bowhay. Would you test locally and see if you're getting an error with JAX (when you remove it from the list of incompatible backends)? I haven't seen that one before.

@mdhaber
Copy link
Contributor Author

mdhaber commented May 13, 2025

@j-bowhay Looks like I needed to set jax_jit=False; test is passing with JAX now!

@lucascolley lucascolley added this to the 1.16.0 milestone May 13, 2025
Copy link
Member

@lucascolley lucascolley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks Matt!

Co-authored-by: Matt Haberland <mhaberla@calpoly.edu>
@lucascolley lucascolley merged commit e9d03b0 into scipy:main May 13, 2025
34 of 36 checks passed
@lucascolley lucascolley changed the title ENH: stats.rankdata: add array API support ENH: stats.rankdata: add array API standard support May 13, 2025
@mdhaber
Copy link
Contributor Author

mdhaber commented May 13, 2025

Thanks @j-bowhay @lucascolley for the reviews!

@lucascolley
Copy link
Member

lucascolley commented May 13, 2025

That just emphasizes the need for this function; it's not entirely trivial to get 100% right.

I somewhat agree, although it really depends on how widespread use of np.swapaxes with identical and/or negative axes is. We surely can't introduce helpers into xpx for every single quirk of NumPy. If we hit it again (or better, some other project does), that is probably a strong reason to go ahead with the implementation.

@mdhaber
Copy link
Contributor Author

mdhaber commented May 13, 2025

for every single quirk of NumPy

Sure. Just thought I'd note that the failure with identical axes in moveaxis is for all backends and required by the standard. Also repeated axes will be common for arrays that happen to be 1D or wherever you're swapping axes to simplify implementation and axis happens to already be there.

johnmdusel pushed a commit to johnmdusel/scipy that referenced this pull request May 13, 2025
* ENH: stats.rankdata: draft array API support

* TST: stats.rankdata: add array API support

* TST: stats: fix failing tests

* MAINT: stats.rankdata: modernizations

* MAINT: stats.rankdata: updates per self-review / CI results

* MAINT: stats.rankdata: revisions per review

* MAINT: stats.rankdata: run tests with jax_jit=False

* Update scipy/stats/_stats_py.py

Co-authored-by: Matt Haberland <mhaberla@calpoly.edu>

---------

Co-authored-by: Lucas Colley <lucas.colley8@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
array types Items related to array API support and input array validation (see gh-18286) CI Items related to the CI tools such as CircleCI, GitHub Actions or Azure enhancement A new feature or improvement scipy._lib scipy.stats
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants