ENH: `stats.boxcox_llf`: add array API support #21097

lucascolley · 2024-07-02T11:29:13Z

Reference issue

Towards gh-20544

What does this implement/fix?

Additional information

Seems quite simple!

[skip cirrus] [skip circle]

mdhaber · 2024-07-02T15:53:53Z

scipy/stats/_morestats.py

    else:
        # Transform without the constant offset 1/lmb.  The offset does
        # not affect the variance, and the subtraction of the offset can
        # lead to loss of precision.
        # Division by lmb can be factored out to enhance numerical stability.
        logx = lmb * logdata
+        # convert to `np` for `special.logsumexp`
+        logx = np.asarray(logx)


I suppose we can merge this but I'll add to the list that we need to come back after special.logsumexp is done.

mdhaber · 2024-07-02T15:57:56Z

scipy/stats/_morestats.py


    # Compute the variance of the transformed data.
    if lmb == 0:
-        logvar = np.log(np.var(logdata, axis=0))
+        logvar = xp.log(xp.var(logdata, axis=0))


Separately, we'll want to add axis support. Would you be willing to do that as a follow-up?

scipy/stats/_morestats.py

mdhaber · 2024-07-02T16:03:17Z

scipy/stats/tests/test_morestats.py

 class TestBoxcox_llf:

-    def test_basic(self):
+    def test_basic(self, xp):


Looks like there are no tests of native dtype or float32. What do you think of parametrizing this one over those possibilities?

Looks like we need some tol changes

I think it's just that somewhere in the float32 calculation the result is converted to an inaccurate float64. The reference value is still float64, so the test didn't fail due to dtype. So if the dtype is not promoted accidentally and the reference value is the desired dtype, I think it will be ok.

Update: the bug is in _log_var and _log_mean introduced in gh-19604. (I don't blame us; it is a pre-NEP 50 thing.) At the time, it appeared that boxcox_llf would always return float64 regardless of input dtype, so two things were done under that assumption: conversion of the input of _log_var to complex128 regardless of input dtype, and use of np.log(len(logx)) (in both _log_var and _log_mean) which will always cause the result to be float64. However, with NEP 50 rules / NumPy 2.0 (or other backends that follow array type promotion rules), the current lmb=0 code path results in float32 output, and the code before gh-19604 would have preserved the input dtype. So assuming you don't want to work on _log_var and _log_mean in this PR, I think the way to fix this is to coerce the dtype of the output of the function to the input dtype. The conversion can be removed when NumPy 2.0 is the minimum supported version and _log_var/_log_mean functions are fixed. Or, after fixing _log_var/_log_mean, we could let pre-NEP 50 NumPy follow its own rules and make a dtype exception in the tests.

[skip cirrus] [skip circle]

mdhaber · 2024-07-07T01:00:39Z

Thanks @lucascolley. I went ahead and merged. There is some follow-up work to be done, so I'll leave the box unchecked. I think we need to:

resurrect logsumexp
add array API support to _log_var and _log_mean
add axis support

Interested in some of those?

* ENH: `stats.boxcox_llf`: add array API support

lucascolley · 2024-07-21T22:51:09Z

@mdhaber could you give me some pointers for what axis support should look like? I'll send a PR.

mdhaber · 2024-07-22T01:01:30Z

Yeah! For the most part, it just means adding axis as a kwarg to the function and consistently passing it to the reducing functions called within (var, sum, and logsumexp). Let's carry that through _log_mean and _log_var, too. One nice thing is that the example that loops through all the values of $\lambda$ can be replaced with a single vectorized call. When the PR is up, I can suggest how to add the _axis_nan_policy decorator and add the standard tests. We won't worry about array API initially. How does that sound?

ENH: stats.boxcox_llf: add array API support

ca6c197

[skip cirrus] [skip circle]

github-actions bot added scipy.stats enhancement A new feature or improvement labels Jul 2, 2024

lucascolley requested a review from mdhaber July 2, 2024 11:31

lucascolley added this to the 1.15.0 milestone Jul 2, 2024

lucascolley mentioned this pull request Jul 2, 2024

ENH: stats: add array API-support #20544

Open

mdhaber reviewed Jul 2, 2024

View reviewed changes

scipy/stats/_morestats.py Show resolved Hide resolved

mdhaber reviewed Jul 2, 2024

View reviewed changes

adjustments per review [skip cirrus] [skip circle]

8badfb5

lucascolley added the array types Items related to array API support and input array validation (see gh-18286) label Jul 4, 2024

add dtype cast before return

050a191

[skip cirrus] [skip circle]

lucascolley requested a review from mdhaber July 6, 2024 13:15

mdhaber merged commit 9e688a9 into scipy:main Jul 7, 2024

lucascolley deleted the xp_boxcox_llf branch July 7, 2024 08:22

ev-br pushed a commit to ev-br/scipy that referenced this pull request Jul 14, 2024

ENH: stats.boxcox_llf: add array API support (scipy#21097)

d1503f2

* ENH: `stats.boxcox_llf`: add array API support

mdhaber mentioned this pull request Jul 22, 2024

ENH: stats.alexandergovern: vectorize calculation for n-D arrays #21089

Merged

lucascolley mentioned this pull request Sep 7, 2024

ENH: stats.boxcox_llf: add GPU support #21518

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

ENH: `stats.boxcox_llf`: add array API support #21097

ENH: `stats.boxcox_llf`: add array API support #21097

Uh oh!

lucascolley commented Jul 2, 2024

Uh oh!

mdhaber Jul 2, 2024

Uh oh!

mdhaber Jul 2, 2024 •

edited

Loading

Uh oh!

lucascolley Jul 2, 2024

Uh oh!

Uh oh!

mdhaber Jul 2, 2024

Uh oh!

lucascolley Jul 4, 2024

Uh oh!

mdhaber Jul 5, 2024 •

edited

Loading

Uh oh!

mdhaber commented Jul 7, 2024 •

edited

Loading

Uh oh!

lucascolley commented Jul 21, 2024

Uh oh!

mdhaber commented Jul 22, 2024 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

ENH: stats.boxcox_llf: add array API support #21097

ENH: stats.boxcox_llf: add array API support #21097

Uh oh!

Conversation

lucascolley commented Jul 2, 2024

Reference issue

What does this implement/fix?

Additional information

Uh oh!

mdhaber Jul 2, 2024

Choose a reason for hiding this comment

Uh oh!

mdhaber Jul 2, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lucascolley Jul 2, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

mdhaber Jul 2, 2024

Choose a reason for hiding this comment

Uh oh!

lucascolley Jul 4, 2024

Choose a reason for hiding this comment

Uh oh!

mdhaber Jul 5, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mdhaber commented Jul 7, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lucascolley commented Jul 21, 2024

Uh oh!

mdhaber commented Jul 22, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

ENH: `stats.boxcox_llf`: add array API support #21097

ENH: `stats.boxcox_llf`: add array API support #21097

mdhaber Jul 2, 2024 •

edited

Loading

mdhaber Jul 5, 2024 •

edited

Loading

mdhaber commented Jul 7, 2024 •

edited

Loading

mdhaber commented Jul 22, 2024 •

edited

Loading