ENH/DEP: stats.find_repeats/tiecorrect/etc.: add array API support or deprecate?  #21077

@mdhaber

While working on gh-20544, I've been wondering whether, for certain functions, we want to
a) add array API support,
b) declare them legacy, or deprecate and remove them, or
c) leave them as they are.
If the long-term goal is to make SciPy largely array-API compatible, I assume we will want to do one of the first two.

I think it would be easiest to consider just one function at a time. Perhaps when there's a consensus about one (or if we reach an impasse), we move on to the next.

For today:
find_repeats is trivial to implement in terms of the array API:

import array_api_strict
from array_api_compat import array_namespace

# If there are NaNs, `scipy.stats.find_repeats` treats them as distinct.
# `unique_counts` does, too, so no special handling is required.
def find_repeats(x):
    xp = array_namespace(x)  # namespace of the library that produced `x`
    unique, counts = xp.unique_counts(x)
    repeat_mask = counts > 1
    return unique[repeat_mask], counts[repeat_mask]  # would return in result object

# Example input (illustrative values only)
x = array_api_strict.asarray([1., 2., 2., 3., 3., 3.])
find_repeats(x)

But because it is trivial, I wonder whether it makes sense to keep it around. For reference, python-api-inspect found no uses of find_repeats outside tests in the 2019 blog post data, whereas popular functions appear dozens or hundreds of times. (We could re-run the numbers, but I'm guessing they are representative enough.) On Stack Overflow, all the appearances of find_repeats seem to be people defining a different function with the same name.

I think it could be vectorized using the techniques of _rankdata, but the output would most naturally be a ragged array, so I'm not sure if we want to come up with a way of representing that information in an array API compatible way.
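
To illustrate the ragged-output concern, here is a minimal sketch that applies a 1-D find_repeats to each row of a 2-D array. It uses plain NumPy rather than the array API, and `find_repeats_1d` and `find_repeats_rows` are hypothetical helper names made up for this example, not anything in SciPy. It does not use the `_rankdata` techniques; it only shows that the number of repeated values differs per row, so the results do not stack into a rectangular array.

```python
import numpy as np

def find_repeats_1d(x):
    # NumPy stand-in for the array API version sketched earlier
    unique, counts = np.unique(x, return_counts=True)
    repeat_mask = counts > 1
    return unique[repeat_mask], counts[repeat_mask]

def find_repeats_rows(x):
    # Hypothetical helper: apply the 1-D function to each row of a 2-D array.
    # A Python list is used because the per-row results have different lengths.
    return [find_repeats_1d(row) for row in x]

x = np.asarray([[1, 1, 2, 2, 3],
                [4, 5, 6, 7, 8],
                [9, 9, 9, 9, 9]])

results = find_repeats_rows(x)
lengths = [len(values) for values, _ in results]
print(lengths)  # [2, 0, 1]: two repeated values, none, and one
```

Any array-API-compatible return type would need to encode these per-row lengths somehow, for example with padding plus a mask, or with flat values plus offsets. Those are only possibilities, not a proposal.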

What should we do here?

Labels: array types, deprecated, enhancement, scipy.stats
