-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
[ENH] circvar calculated simply as 1-R #5747
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
scipy/stats/morestats.py
Outdated
[0, 1], 0 standing for no variance, and 1 for a large variance. | ||
|
||
References | ||
----- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Title underline too short" on this line, and one more below.
This is not backwards compatible, right? Wouldn't adding an option be a better idea? |
No, it's not compatible. The
All of them define the circular variance as Is a compatibility option still needed in this case? |
a085fe6
to
e6011cb
Compare
@@ master #5747 diff @@
======================================
Files 235 235
Stmts 43219 43225 +6
Branches 8143 8145 +2
Methods 0 0
======================================
+ Hit 33601 33605 +4
- Partial 2591 2593 +2
Missed 7027 7027
|
How compatible (conceptually) |
I don't remember any details for how transformation of circular data works. If this is just a renormalization of the variance into radians, then this should be an option not a backwards incompatible redefinition. |
mirca said:
josef-pkt said:
This is how circular variance is defined in the literature, with values between 0 and 1, and independent on the units.
It's not only a renormalization. After normalisation, the PR's I would strongly suggest to keep the new definition of |
3 years ago I was working on a circular "toolbox" based initially on a matlab translation. I used var = 1 - R, and a wrapper that converts between the circle units (degrees, hours or whatever) into radians and back. AFAIR (!), I never managed to figure out these scipy stats function and left them to Travis. (I thought differences might be due to scaling. There are also weird things in R packages, where I wasn't quite sure where their scaling or transformation comes from.) |
The circular statistic function circvar is updated to estimate circular variance simply as one minus the mean resultant vector (1-R), instead of using approximation to the linear variance. This definition is compatible with the literature (e.g., Fisher, 1993; Jammalamadaka and SenGupta, 2001) and other implementations in statistical packages (e.g., R and Matlab). Optional boolean parameter ``normalize`` added to both circvar and cirstd determines whether the returned values are independent on the variable units, or whether the values are scaled by the range of variable values (currently default not to break compatability). Docstring of circvar, circstd and circmean have more accurate description for the parameters ``high`` and ''low``.
e6011cb
to
64f36b0
Compare
A new push:
|
I gather there might be some use of / experience with these sorts of functions in astropy land. |
+1 for this feature. Also, it might be also good to allow a |
scipy/stats/morestats.py
Outdated
@@ -2671,18 +2671,19 @@ def _circfuncs_common(samples, high, low): | |||
|
|||
def circmean(samples, high=2*pi, low=0, axis=None): | |||
""" | |||
Compute the circular mean for samples in a range. | |||
Compute the circular mean for samples assumed to be in the | |||
range [low to high]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolving merge conflicts here means adding all the changes to _morestats.py
manually. I'm not making this chance because the changes to these headings because they need to be a one-line description.
@kleskjr I'd like to help you finish this up. Can you select "Allow edits from maintainers"? |
@kleskjr Thought I'd check in again. If you could allow select "Allow edits from maintainers", I'd appreciate it! |
@mdhaber, edits are allowed. Sorry for the delay! |
Thanks @kleskjr. How important do you think the In
The validity of |
@steppi I suspect that the reason this hasn't already been fixed is that others are unsure of the math. That's why I tagged you, but NP if it's not of interest. |
Definitely of interest but my time is scarce now. I think I can take a look in 2-3 weeks. |
@mdhaber I concur. |
normalize : boolean, optional | ||
If True, the returned value is equal to ``sqrt(-2*log(R))`` and does | ||
not depend on the variable units. If False (default), the returned | ||
value is scaled by ``((high-low)/(2*pi))``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"the variable units" is unclear.
normalize : boolean, optional | |
If True, the returned value is equal to ``sqrt(-2*log(R))`` and does | |
not depend on the variable units. If False (default), the returned | |
value is scaled by ``((high-low)/(2*pi))``. | |
normalize : boolean, optional | |
If True, the returned value is equal to ``sqrt(-2*log(R))`` and does | |
not depend on the units of `samples`. If False (default), the returned | |
value is scaled by ``((high-low)/(2*pi))``. See Notes for details. |
Let's commit this when it's ready to merge. I've checked the line lengths, so I don't think CI is needed ([skip ci]
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. It's been 5 years since I've done anything with directional statistics, but I've double-checked @kleskjr's references and agree that this is the most reasonable definition of the circular variance. I also agree with not including a normalize option for cirvar
.
There were no replies to the mailing list post, there is +1 explicit approval from @steppi (and me), and there seems to be other support above, so I'm merging. I added a note about the backward-incompatible change in the release notes. Thanks for the review @steppi! |
* master: (632 commits) Update _dual_annealing.py (scipy#15939) TST: stats: make `check_sample_var` two-sided (scipy#15723) DOC: sparse.linalg: add citations for COLAMD DOC: fix missing comma in conf ENH/MAINT: Version switcher from the sphinx theme (scipy#15380) MAINT: stats: remove support for `_rvs` without `size` parameter (scipy#15917) BUG: Handle base case for scipy.integrate.simpson when span along the axis is 0 (scipy#15824) MAINT: stats: adjust tolerance for failing test only MAINT: stats: adjust tolerance of failing TestTruncnorm MAINT: special: Clean up C style in ndtr.c CI: remove pin on Jinja2 (scipy#15895) STY: Remove white spaces MAINT: stats: exempt gilbrat from refguide_check MAINT: stats: rename continuous_gilbrat->continuous_gibrat MAINT: stats: correct name Gilbrat -> Gibrat [ENH] circvar calculated simply as 1-R (scipy#5747) DEP: deal with deprecation of ndim >1 in bspline MAINT: stats: more specific error type from `rv_continuous.fit` (scipy#15778) DOC: fix import in example in _morestats (scipy#15900) [BUG] make p-values consistent with the literature (scipy#15894) ...
The circular statistic function circvar is updated to estimate circular
variance simply as one minus the mean resultant vector (1-R), instead
of using approximation to the linear variance. Thus, the returned value
is in the range [0, 1]. This definition is compatible with the
literature (e.g., Fisher, 1993; Jammalamadaka and SenGupta, 2001) and
other implementations in statistical packages (e.g., R and Matlab).