-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
BUG: stats: Fix broadcasting in the rvs() method of the distributions. #5526
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Side note: I would really like to eliminate the use of the |
fb488bb
to
4a5bd6a
Compare
Yep, sub-optimal design but no good way to get rid of |
Change is nicely local, called from only one place. So should be able to get it reviewed/merged without too large risks right before a release. Existing test coverage for non-scalar loc/scale/size is very thin at the moment though, needs expanding. |
Right, the code in this PR leaves the use of
Working on it... |
4a5bd6a
to
435d658
Compare
Updated with some tests. I'm still thinking about some more thorough tests. Suggestions are welcome! This is still WIP. The |
I quickly drafted a test based on the stackoverflow thread, which is sensitive to the order of broadcasting and drawing variates: https://github.com/ev-br/scipy/tree/pr/5526 In master, this fails all over. With this PR, I only see a couple of failures in This test is just a suggestion. EDIT: Also array-valued shapes: ev-br@fda4518 |
Thanks! It occurred to me that another simple test would be to use |
Actually, this isn't true, because some of the
Calling that function several times with scalars will not produce the same variates as calling it once with broadcast arguments or with |
|
It would also be helpful to have a test explicitly comparing the output of |
Regarding the This can be done in a separate PR, of course :-). I just think it would be nice to change both things in one release if possible. |
|
||
def test_all_rvs_broadcast_shape(): | ||
# This test only checks the *shape* of the value returned by rvs(). | ||
for dist in dists: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that dists
is a hard coded list which does not include all distributions IIRC. (I actually wonder why this is so, but that's off-topic)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm moving the tests over to test_continuous_basic.py
and test_discrete_basic.py
, where they will use the lists distcont
and distdiscrete
.
(I actually wonder why this is so [...])
I do that a lot when working with the distributions code. :)
435d658
to
d56fe59
Compare
Even without the python 3 test failures, this is still WIP. I haven't touched @ev-br I didn't use your commit, but putting the test "kernel" in |
@rgommers: In case it isn't obvious: this PR won't be ready in time for 0.17. Making the rvs() arguments work correctly might end up requiring that all the distribution methods properly broadcast. In theory, that's supposed to be the case already (see, for example, http://docs.scipy.org/doc/scipy/reference/tutorial/stats.html#broadcasting). In practice, not so much. |
So we can remove the 0.17.0 label from gh-2069? |
Yup, I just did. |
Any time scale for completing this? Just hit the bug too. |
@pwaller I've been on a bit of a hiatus from scipy development, but I plan to get this into 0.18. |
In the meantime, what is the best workaround which won't break once this has landed? Is there any? I'm trying to generate many rvs with many values of parameters, e.g: Edit: in the end I settled with |
d56fe59
to
beb7a7a
Compare
Despite my hiatus in scipy contributions, I haven't forgotten about this. I think the tentative plan is to release 0.18 this month. Is there a more concrete timeline for 0.18? |
The milestone page lists June 14, but judging based on the number of PRs marked for 0.18, I'm skeptical :) I don't know of anything more concrete than that, at least. |
Thanks Eric. Time to dive back into this... |
e7bd655
to
e7a41ca
Compare
e7a41ca
to
ffed817
Compare
Updated. |
This would need a mention in the release notes, in big bold letters. |
@ev-br I added a note to the release notes about the bug fix. It is in the "Backwards incompatible changes" section, mainly because there are cases with arguments that are incompatible with a given size where the function used to return values and now it raises an error. My guess is that case will be pretty rare in the wild (but I admit it is just a guess). I refrained from using "big bold letters". 😄 |
That's big enough and bold enough for me :-). This PR looks good to me now. I'm going to merge it before branching unless there are further comments. |
Needs a rebase (again). Here's something I don't understand:
In scipy 0.17:
Notice that the random draws are the same for |
The class rv_generic was updated to handle broadcasting of the arguments to the rvs() method. The following distributions required additional custom changes to support broadcasting in rvs(): levy_stable, pearson3, rice, truncnorm, hypergeom, planck, randint. The function `numpy.broadcast_to` is used in several places. This function was added to numpy in version 1.10, so a copy was added to _lib/_numpy_compat.py to support older version of numpy. Closes scipygh-2069.
…e notes. Also added subheadings to the section of the release notes on backwards incompatible changes.
Looks like
Those outputs should be the same. I'll fix that now. |
ac350eb
to
761ac90
Compare
Ouch. |
Updated with fix for |
LGTM, merging. Thanks Warren! |
This is a fix for #2069
I haven't added any new tests yet, but the existing stats test suite passes on my system (Mac OS X 10.9.5, python 2.7.10, numpy 1.10.1). Here are some examples using the updated
rvs
method:Argument with incompatible shapes (e.g.
norm.rvs([1, 2, 3], [4, 5])
,gamma.rvs(array([2.0, 5.0, 10.0, 15.0]), size=(2,2))
) will raise a ValueError. The compatibility test is the same as that used by the random variate generators innumpy.random
. The rule is that, ifsize
is given, it determines the shape of the output--broadcasting can not change the output size.