-
-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Handle literal in meta_from_array #6731
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cc @mathause |
Thanks! This does seem to fix the issue (I tested locally but did not run the xarray test suite over it). Testing for the error message may be brittle, though (but I also don't know how to do this better). |
I generally agree with that, but it seems that such issues have no better ways to be handled. Due to the lack of a more concise way to do it, we already have similar handling in Lines 128 to 134 in c631d32
With the above said, I'm happy to change the proposed solution for something more resilient if anyone can suggest an alternative. |
Agreed with everyone here that this seems brittle but I don't immediately see a better way to handle it. |
I just tried the xarray tests with this branch and this fixes 4 of our 5 failures. I don't understand the remaining failure, yet. |
What if we just special case when Admittedly I don't have a good handle on the use case here. More context would be very welcome. |
Please keep us posted.
Who is |
I found the problem - with this workaround the dtype of import numpy as np
import dask.array as da
bar = da.zeros_like(da.array("bar"), dtype=bool)
bar.dtype
# returns dtype('<U3')
bar.compute().dtype
# returns bool For empty string and float it is correct da.zeros_like(da.array(""), dtype=bool).dtype
# returns bool
da.zeros_like(da.array([1.2, 3.4]), dtype=bool).dtype
# returns bool which is why 4 of the 5 tests pass. As for the use case - we try to compare bar & np.array(False) which then returns TypeError: operand type(s) all returned NotImplemented from __array_ufunc__(<ufunc 'bitwise_and'>, '__call__', dask.array<zeros, shape=(), dtype=<U3, chunksize=(), chunktype=numpy.ndarray>, array(False)): 'Array', 'ndarray' |
Do I understand correctly the The problem stems from da.array("jjj")._meta == np.array("")
# -> array(False)
da.array(["jjj"])._meta == np.array("")
# -> array([], dtype=bool)
da.array("")._meta == np.array("")
# -> array(True) Given you already test for if any(...) and meta.dtype.kind == "U" to ensure the error is raised. |
|
For the case at hand you're correct the test isn't strictly necessary. However, I've been taking the approach of not generalizing fixes, meaning if this exception exists in a completely unrelated case (e.g., a non-literal array that may happen to raise the same error message) we won't just assume we're handling it correctly when we probably aren't. This helps prevent silent errors that are much harder to be found in the future.
Yes, I think this is a better approach than comparing against |
In the case above |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for all the detailed explanations!
I left some suggestions. Turns out this also needs to be done for (byte-)strings ("S"
).
dask/array/utils.py
Outdated
] | ||
] | ||
) | ||
and meta.dtype.kind == "U" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and meta.dtype.kind == "U" | |
and meta.dtype.kind in "SU" |
to handle the case
da.zeros_like(da.array(b"u"), dtype=bool).compute()
dask/array/tests/test_array_utils.py
Outdated
assert meta_from_array("").dtype == np.array("").dtype | ||
assert meta_from_array("", dtype="bool").dtype == np.array([], dtype="bool").dtype | ||
assert meta_from_array("", dtype="int").dtype == np.array([], dtype="int").dtype | ||
assert meta_from_array("", dtype="float").dtype == np.array([], dtype="float").dtype |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also test "str"
, u""
, and u"str"
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the filled string cases (e.g., "str"
and u"str"
) actually happening in your tests? I'm asking this because I would generally expect meta
to be reduced to an empty array (empty literal in this case), if that's not actually happening we may have a different issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind, the test you're suggesting is from before meta
getting reduced, so it makes sense.
dask/array/utils.py
Outdated
any( | ||
[ | ||
s in str(e) | ||
for s in [ | ||
"invalid literal", | ||
"could not convert string to float", | ||
] | ||
] | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any( | |
[ | |
s in str(e) | |
for s in [ | |
"invalid literal", | |
"could not convert string to float", | |
] | |
] | |
) | |
any( | |
s in str(e) | |
for s in [ | |
"invalid literal", | |
"could not convert string to float", | |
] | |
) |
dask/array/utils.py
Outdated
[ | ||
s in str(e) | ||
for s in [ | ||
"unexpected keyword argument", | ||
"is an invalid keyword for", | ||
"Did not understand the following kwargs", | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[ | |
s in str(e) | |
for s in [ | |
"unexpected keyword argument", | |
"is an invalid keyword for", | |
"Did not understand the following kwargs", | |
] | |
s in str(e) | |
for s in [ | |
"unexpected keyword argument", | |
"is an invalid keyword for", | |
"Did not understand the following kwargs", |
I addressed your suggestions @mathause , could you check if everything works now? |
Now all xarray tests pass (locally) so this should be good to go from my side. |
Thanks @mathause for confirming. From my side it's complete too, appreciate it if @TomAugspurger or @jakirkham have chance to review/merge at some point. |
Thanks all! |
Attempt at fixing #6680 (comment) .