Closed
Description
scipy.stats.binned_statistic ignores the mask of a masked array. I guess it is debatable whether this is a feature request (it should support masked arrays) or a bug (it doesn't).
Reproducing code example:
Simple example:
import numpy as np
import scipy.stats

rnd = np.random.RandomState(seed=54)
x = rnd.uniform(high=2*np.pi,size=1000)
r2 = rnd.normal(size=len(x))
f = np.sin(x) + 0.1 * r2
# To demonstrate, make some of f NaNs based on r2 then use a masked array.
# We could just make a masked array but I want to *know* it's not working
f[np.abs(r2)>2] = np.nan
print(np.sum(np.isnan(f)))
ff = np.ma.masked_invalid(f)
stat,*_ = scipy.stats.binned_statistic(x,ff) # binned_statistic calls binned_statistic_dd
print(stat)
Of course, you could do:
stat2,*_ = scipy.stats.binned_statistic(x,ff,statistic=np.nanmean)
but it is slower. In Jupyter:
%timeit scipy.stats.binned_statistic(x,ff)
%timeit scipy.stats.binned_statistic(x,ff,statistic=np.nanmean)
193 µs ± 5.28 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
669 µs ± 26.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
That is about 3.5 times as long, i.e. roughly 250% slower (both timings vary by about 3 to 4%).
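Another possible workaround, sketched here under some assumptions, is to drop the masked entries before binning so that the built-in "mean" statistic can still be used. The helper name `binned_mean_skip_masked` is hypothetical (it is not part of SciPy), and I have not timed it against the two calls above. It passes `range` explicitly so that removing points does not shift the bin edges.

```python
import numpy as np
import scipy.stats


def binned_mean_skip_masked(x, values, bins=10):
    """Hypothetical helper: bin only the unmasked entries of `values`."""
    x = np.asarray(x)
    values = np.ma.asarray(values)
    # True where the entry is NOT masked
    valid = ~np.ma.getmaskarray(values)
    # Take the range from the full x so the edges match the unfiltered call
    full_range = (x.min(), x.max())
    return scipy.stats.binned_statistic(
        x[valid],
        values.compressed(),  # unmasked data only, same order as x[valid]
        statistic="mean",
        bins=bins,
        range=full_range,
    )


# Same data as in the example above
rnd = np.random.RandomState(seed=54)
x = rnd.uniform(high=2 * np.pi, size=1000)
r2 = rnd.normal(size=len(x))
f = np.sin(x) + 0.1 * r2
f[np.abs(r2) > 2] = np.nan
ff = np.ma.masked_invalid(f)

stat3, edges3, _ = binned_mean_skip_masked(x, ff)
print(stat3)
```

This only works around the problem on the caller's side; it does not change how binned_statistic_dd itself treats masked input.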
Error message:
N/A
Scipy/Numpy/Python version information:
import sys, numpy, scipy
print(scipy.__version__, numpy.__version__, sys.version_info)
1.5.0 1.18.5 sys.version_info(major=3, minor=8, micro=3, releaselevel='final', serial=0)
(Basically the latest in Anaconda I think)