-
-
Notifications
You must be signed in to change notification settings - Fork 947
Allow copying in the format cupy_array[:] = numpy_array
#2079
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow copying in the format cupy_array[:] = numpy_array
#2079
Conversation
The __setitem__ implementation from cupy.ndarray checks for an empty slice and if the value being passed is an instance of numpy.ndarray to make a copy of it. That can is a very useful feature in circumstances where we want to create a copy of a numpy.ndarray without requiring mechanisms to identify the type of an array as a CuPy array.
@okuta How do you think? |
This change has a few benefits compared to the original proposal: * Its format is compatible with NumPy's copy of empty sliced arrays, as formats must now match * Avoid accidental overwriting of cupy_array when formats don't match * The current CUDA stream may be used implicitly This change implies on cupy_array being created beforehand. Eventually, we can benefit from numpy/numpy#13046 and this change combined.
After thinking about this PR overnight, I realized there's a few improvements that could be made to make this change safer and better. Please take a look at the latest commits as well. |
Requirements: * numpy/numpy#13046 * cupy/cupy#2079
It may be helpful also to see what we envision a use case for such a change is. Please refer to dask/dask-glm@e39aafd#diff-6fa4900cefa9ee15a64539f9fcbf1639R182. Assume Note that for this particular case, we need |
…opy-numpy-instance
After a conversation with @anaruse, he said one of the concerns here was due to broadcasting. From our understanding, the issue arises when the NumPy array has a different stride, when it's a view, for example. The latest commits address that matter. @niboshi could you confirm we understood your concerns correctly and if there are any others that we should address? |
Just extending a bit on the original reason why we want to have this special case: with the I understand the implications this change has, so I'd be fine if we could have this in 6.0.0 as opt-in-only, through a mechanism such as an environment variable, just as NumPy does current for |
Disucssed with @asi1024 . In this particular case (dask/dask-glm@e39aafd#diff-6fa4900cefa9ee15a64539f9fcbf1639R182), it seems that after #2165 is merged, the code should work because As for mixed-array-type case you mentioned above, could you elaborate why you need such cases supported? |
Thanks @niboshi and @asi1024 for checking on this, and for letting me know about #2165. This would certainly solve that particular case from dask-glm. For the mixed arrays, consider a slightly different example in the same dask-glm PR, dask/dask-glm@e39aafd#diff-6fa4900cefa9ee15a64539f9fcbf1639R240. This function receives a I hope this clarifies a bit the sort of use case we have for this. As for the "NumPy-fallback" feature you mentioned, would it work to wrap a third-party functions (like SciPy) as well? For example, could we pass CuPy arrays to a pure-CPU implementation in SciPy and ensure we get back a CuPy array? I can't think of a way to do that, but if we could, that would be great and probably solve this as well. |
In this case, as you know which argument may be of differerent types, I think you can write
I don't think such cases will be supported either (@asi1024 How do you think?). |
I don't think that works, we would have to assume that we know |
Maybe my last comment wasn't very clear, when I say |
My assumption was If this assumption is incorrect and |
Sure, this would work. But the point of having We understand the potential harm for allowing implicit copies, so having this special copy |
Ah, OK, I missed the existence of other libraries. |
No worries, there are indeed many details and different use cases involved, it's not too difficult to get confused. :) Please let me know what you guys discuss further, once again, we're open to different solutions as well, provided that we can keep the main benefits of |
I'm sorry, maybe it's not clear yet.
So, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much.
LGTM except this comment.
cupy/core/core.pyx
Outdated
else: | ||
raise ValueError( | ||
"copying a numpy.ndarray to a cupy.ndarray by empty slice " | ||
"assignment must ensure arrays exact same shape and dtype") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use single quotes for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for noticing that, thought that flake8 would catch this, but I realized it doesn't support that, at least without third-party plugins. Just pushed a fix for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Thank you for the fix. |
Successfully created a job for commit 0e329c7: |
Jenkins CI test (for commit 0e329c7, target branch master) failed with status FAILURE. |
Looks like the failure is in chainer-doc, probably not related to this PR. |
Looks so. Retrying. |
Successfully created a job for commit 0e329c7: |
Jenkins CI test (for commit 0e329c7, target branch master) succeeded! |
Thanks. LGTM! |
cupy_array[:] = numpy_array
Awesome! Thanks @niboshi for the review and merging! |
By the way, since this now requires explicit opt-in, do you think it could be backported to the next CuPy 6 release? |
…y-instance Allow copying in the format `cupy_array[:] = numpy_array`
Yes, sure. Thank you for pointing out! |
The setitem implementation from cupy.ndarray checks for an empty slice and
if the value being passed is an instance of numpy.ndarray to make a copy of it.
That can is a very useful feature in circumstances where we want to create a
copy of a numpy.ndarray without requiring mechanisms to identify the type of an
array as a CuPy array.
This resolves #593.