Skip to content

Conversation

rjeb
Copy link
Contributor

@rjeb rjeb commented May 28, 2022

Addresses #20326

np.unique previously had it's functionality changed so NaN values would be treated as non-unique.
This PR puts the functionality into the kwarg equal_nans(default: True).

@rjeb
Copy link
Contributor Author

rjeb commented May 28, 2022

@mattip I have some changes from something I was trying before that I think addresses #20326.

Co-authored-by: Matti Picus <matti.picus@gmail.com>
@mattip
Copy link
Member

mattip commented May 29, 2022

@asmeurer is this sufficient to bring np.unique(a, equal_nans=False) into compliance with the array API?

Bikeshedding on equal_nans, should it be maybe collapse_nans instead? They are not really "equal". What do others think?

Co-authored-by: Matti Picus <matti.picus@gmail.com>
@seberg seberg added the 62 - Python API Changes or additions to the Python API. Mailing list should usually be notified. label Jun 1, 2022
@mattip
Copy link
Member

mattip commented Jun 1, 2022

Bikeshedding on equal_nans ...

This got a green light in the developer meeeting, so LGTM.

@mattip mattip merged commit 6cada27 into numpy:main Jun 1, 2022
@mattip mattip added the 09 - Backport-Candidate PRs tagged should be backported label Jun 1, 2022
@seberg
Copy link
Member

seberg commented Jun 1, 2022

Actually, a little oops there. My/our argument was that we already have equal_nans, but that is not true. The parameter is called equal_nan on the other functions, so maybe that would be better?

@asmeurer
Copy link
Member

asmeurer commented Jun 1, 2022

The array API specifies that nans are not equal. See https://data-apis.org/array-api/latest/API_specification/generated/signatures.set_functions.unique_all.html#signatures.set_functions.unique_all for instance.

We would need to invert this flag in the array_api. The unique functions in the array API have new names so it wouldn't be backwards incompatible to change the default for them.

@mattip
Copy link
Member

mattip commented Jun 1, 2022

@asmeurer NumPy has chosen in np.unique to not follow the array API, rather to reduce all nans to a single value. This PR is to allow np.array_api.unique_all to use np.unique(..., equal_nan=False) under the hood. My question is if this PR is sufficient for the implementation of np.array_api.unique_all, which was the original request in #20326

@asmeurer
Copy link
Member

asmeurer commented Jun 1, 2022

Yes, I believe it should be. We just need to add equal_nans=False to the wrappers in the array API.

@charris charris changed the title ENH: Add equals_nans kwarg to np.unique Fixes #20326 ENH: Add equals_nans kwarg to np.unique Jun 2, 2022
@charris charris removed the 09 - Backport-Candidate PRs tagged should be backported label Jun 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
62 - Python API Changes or additions to the Python API. Mailing list should usually be notified.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants