-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
ENH: stats.boxcox_llf
: add array API support
#21097
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
[skip cirrus] [skip circle]
else: | ||
# Transform without the constant offset 1/lmb. The offset does | ||
# not affect the variance, and the subtraction of the offset can | ||
# lead to loss of precision. | ||
# Division by lmb can be factored out to enhance numerical stability. | ||
logx = lmb * logdata | ||
# convert to `np` for `special.logsumexp` | ||
logx = np.asarray(logx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose we can merge this but I'll add to the list that we need to come back after special.logsumexp
is done.
|
||
# Compute the variance of the transformed data. | ||
if lmb == 0: | ||
logvar = np.log(np.var(logdata, axis=0)) | ||
logvar = xp.log(xp.var(logdata, axis=0)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Separately, we'll want to add axis
support. Would you be willing to do that as a follow-up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can try!
scipy/stats/tests/test_morestats.py
Outdated
class TestBoxcox_llf: | ||
|
||
def test_basic(self): | ||
def test_basic(self, xp): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like there are no tests of native dtype or float32
. What do you think of parametriz
ing this one over those possibilities?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we need some tol changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's just that somewhere in the float32
calculation the result is converted to an inaccurate float64
. The reference value is still float64
, so the test didn't fail due to dtype
. So if the dtype
is not promoted accidentally and the reference value is the desired dtype, I think it will be ok.
Update: the bug is in _log_var
and _log_mean
introduced in gh-19604. (I don't blame us; it is a pre-NEP 50 thing.) At the time, it appeared that boxcox_llf
would always return float64
regardless of input dtype, so two things were done under that assumption: conversion of the input of _log_var
to complex128
regardless of input dtype, and use of np.log(len(logx))
(in both _log_var
and _log_mean
) which will always cause the result to be float64
. However, with NEP 50 rules / NumPy 2.0 (or other backends that follow array type promotion rules), the current lmb=0
code path results in float32
output, and the code before gh-19604 would have preserved the input dtype. So assuming you don't want to work on _log_var
and _log_mean
in this PR, I think the way to fix this is to coerce the dtype of the output of the function to the input dtype. The conversion can be removed when NumPy 2.0 is the minimum supported version and _log_var
/_log_mean
functions are fixed. Or, after fixing _log_var
/_log_mean
, we could let pre-NEP 50 NumPy follow its own rules and make a dtype exception in the tests.
[skip cirrus] [skip circle]
Thanks @lucascolley. I went ahead and merged. There is some follow-up work to be done, so I'll leave the box unchecked. I think we need to:
Interested in some of those? |
* ENH: `stats.boxcox_llf`: add array API support
@mdhaber could you give me some pointers for what |
Yeah! For the most part, it just means adding |
Reference issue
Towards gh-20544
What does this implement/fix?
Additional information
Seems quite simple!