Skip to content

Conversation

lucascolley
Copy link
Member

@github-actions github-actions bot added scipy.cluster scipy._lib scipy.spatial CI Items related to the CI tools such as CircleCI, GitHub Actions or Azure array types Items related to array API support and input array validation (see gh-18286) maintenance Items related to regular maintenance tasks labels Nov 8, 2024
except TypeError:
coerced_xp = array_namespace(xp.asarray(3))
array = coerced_xp.asarray(array, dtype=dtype, copy=copy)
array = xp_in.asarray(array)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xp_in is only used to deal with array-like input. It might be cleaner to omit the argument and only deal with array-likes when array_namespace(array) is np_compat.

Copy link
Contributor

@mdhaber mdhaber Nov 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I'm also not seeing the difference between xp_out and xp - now the function doesn't even use xp, right?

So I would suggest leaving just the argument xp, which specifies the desired output array type. And then I think we can call this function xp_asarray because it is a convenience wrapper of xp.asarray that is more accepting of array inputs and has some other bells and whistles like check_finite.

In addition to xp, I think it's important (given our use-cases) to add an option here that allows the user to specify the output device. I'd like to be able to use this function to convert a CuPy array to a NumPy array, but IIUC, that won't work unless we specify the device:

If device is None and x supports DLPack, the output array must be on the same device as x.

There are some places we'll want to specify a device manually when we call this function - like if we're converting from another library to NumPy as then back to the array library, we should transfer the result back to whatever device it came from. Still, I think that there should be a default device that goes to whatever device xp.asarray would go to if the input were array-like. (Presumably that can be determined by default_device?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts here @lucascolley? Does it sound ok to try trimming down to xp only and adding a device argument?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, in particular coercing array-likes if needed when array_namespace(array) is np_compat

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I didn't do exactly that. But IIUC your point was that if the input is array-like but NOT an array of any sort, it needs to be turned into an array before we can do much with it. To keep the logic simple, I turn it into a NumPy array and then I'm done with that special case. That might mean that it gets copied into a NumPy array and possibly copied again to get to the final array type, but I have no problem with that if the object is not an array to begin with.

Can you take a look at the function logic in #113, keeping in mind that xp now represents the desired output type (and the input can be any array or array-like)? (Device transfers don't work yet... but I've considered the need for device transfers while writing it, so it should be easy to add.)

I think you had trouble here because cluster currently relies heavily on implicit conversions with __asarray__. That will need quite a bit of work, so it is probably best to do that in a separate PR. Maybe in this PR, let's get through all the modules that should be pretty easy, and when we run into serious trouble we can do it in a follow-up.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you meant gh-113?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@mdhaber mdhaber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: data-apis/array-api-strict#91 (comment)

I don't think I have the capacity to push it through immediately, but maybe someone can follow through on the idea.

If others can weigh in on these comment below, I can help out here.

except TypeError:
coerced_xp = array_namespace(xp.asarray(3))
array = coerced_xp.asarray(array, dtype=dtype, copy=copy)
array = xp_in.asarray(array)
Copy link
Contributor

@mdhaber mdhaber Nov 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I'm also not seeing the difference between xp_out and xp - now the function doesn't even use xp, right?

So I would suggest leaving just the argument xp, which specifies the desired output array type. And then I think we can call this function xp_asarray because it is a convenience wrapper of xp.asarray that is more accepting of array inputs and has some other bells and whistles like check_finite.

In addition to xp, I think it's important (given our use-cases) to add an option here that allows the user to specify the output device. I'd like to be able to use this function to convert a CuPy array to a NumPy array, but IIUC, that won't work unless we specify the device:

If device is None and x supports DLPack, the output array must be on the same device as x.

There are some places we'll want to specify a device manually when we call this function - like if we're converting from another library to NumPy as then back to the array library, we should transfer the result back to whatever device it came from. Still, I think that there should be a default device that goes to whatever device xp.asarray would go to if the input were array-like. (Presumably that can be determined by default_device?)

Comment on lines +325 to +326
actual = _asarray(actual, xp_in=xp, xp_out=np)
desired = _asarray(desired, xp_in=xp, xp_out=np)
Copy link
Contributor

@mdhaber mdhaber Nov 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both the actual and desired arrays should be of the same type if the assertion is to pass. If JAX and NumPy are the only array libraries that get here, and both of them work with np.testing, why is a conversion necessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

array-api-strict gets here too, and those arrays are incompatible with np.testing when __array__ is removed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I think the reason I asked originally is because if JAX uses np.testing, we wouldn't want to convert it to a NumPy array if it's on another device. So I think i'll change this to convert to NumPy only if it's array-api-strict?

xp_out = array_namespace(array) if xp_out is None else xp_out
xp_in = array_namespace(array) if xp_in is None else xp_in

if is_numpy(xp_out) and is_numpy(xp_in):
# Use NumPy API to support order
if copy is True:
array = np.array(array, order=order, dtype=dtype, subok=subok)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering whether we should do something when this function is called with options that aren't supported by the array library? https://github.com/scipy/scipy/pull/18668/files#r1841587120

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
array types Items related to array API support and input array validation (see gh-18286) CI Items related to the CI tools such as CircleCI, GitHub Actions or Azure maintenance Items related to regular maintenance tasks scipy._lib
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants