
Conversation

sadigulcelik

Closes gh-18295. Updates the scipy.special.logsumexp calculation with the more precise logsumexp calculation outlined in gh-18295 where possible.

Reference issue

Closes scipy#18295.

What does this implement/fix?

Updates the logsumexp calculation with the more precise calculation outlined in scipy#18295 where possible.

Additional information
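For context, the more precise calculation discussed in gh-18295 can be sketched roughly as follows. This is a minimal 1-D, default-arguments sketch with illustrative names, not the PR's actual code: terms attaining the maximum are counted exactly (each contributes exp(0) == 1), and the remaining terms enter through log1p, which avoids the cancellation that a plain log(sum(exp(a - a_max))) can suffer.

```python
import numpy as np

def logsumexp_sketch(a):
    """Hedged 1-D sketch of the precision trick from gh-18295."""
    a = np.asarray(a, dtype=float)
    a_max = np.max(a)
    mask = (a == a_max)
    m = np.count_nonzero(mask)            # number of max-attaining terms (>= 1)
    R = np.sum(np.exp(a[~mask] - a_max))  # remaining terms, each strictly < 1
    # logsumexp = a_max + log(m + R) = a_max + log(m) + log1p(R / m)
    return a_max + np.log(m) + np.log1p(R / m)
```

For example, `logsumexp_sketch([1000.0, 1000.0])` returns `1000.0 + log(2)` without overflow, since the shift by `a_max` is applied before exponentiation.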
@sadigulcelik sadigulcelik requested a review from person142 as a code owner May 5, 2023 02:46
Fix test to use assert_allclose to measure error at an appropriate tolerance

Co-authored-by: Matteo Raso <33975162+MatteoRaso@users.noreply.github.com>
Member

@ev-br ev-br left a comment


Looks like a good start!

I think the implementation could use some explanatory comments and readability tweaks.
It would also be good to add a couple of tests for similar accuracy problems with higher-dimensional arrays, exercising a non-default axis and/or keepdims.
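The requested tests might look something like the following sketch (assuming scipy is installed; the input values are illustrative, not taken from the PR): a 2-D array with repeated maxima, reduced along a non-default axis with keepdims.

```python
import numpy as np
from scipy.special import logsumexp

# Repeated-maximum inputs of the kind discussed in gh-18295, checked on a
# 2-D array with axis=1 and keepdims=True.
a = np.array([[0.0, 0.0, -1000.0],
              [1.0, 1.0, 1.0]])
res = logsumexp(a, axis=1, keepdims=True)

# Row 0: log(1 + 1 + exp(-1000)) == log(2) in double precision.
# Row 1: 1 + log(3).
expected = np.array([[np.log(2.0)], [1.0 + np.log(3.0)]])
assert res.shape == (2, 1)
np.testing.assert_allclose(res, expected, rtol=1e-13)
```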

@ev-br ev-br added the defect (a clear bug or issue that prevents SciPy from being installed or used as expected), scipy.special, and needs-work (items pending response from the author) labels May 5, 2023
@sadigulcelik sadigulcelik requested a review from ev-br May 5, 2023 20:15
Contributor

@lorentzenchr lorentzenchr left a comment


Overall, it looks good. Some improvements might be possible.

# = a_max + log(m + R)
# = a_max + log(m) + log(1 + (1/m) * R)

tmp0 = (a == a_max)

Suggested change
tmp0 = (a == a_max)
mask = (a == a_max)

A more meaningful name would make the code easier to read.

tmp = b * np.exp(a - a_max)

# sumexp for a != a_max
tmp = b * np.exp(a - a_max) * (~tmp0)

Suggested change
tmp = b * np.exp(a - a_max) * (~tmp0)
R = b * np.exp(a - a_max) * (~tmp0)

Why not call it R, as in the comment above? I know my suggestion is not consistent; I only wanted to show the idea.

tmp = b * np.exp(a - a_max) * (~tmp0)

# sumexp for where a = a_max
tmp0 = b*tmp0.astype(float)

Suggested change
tmp0 = b*tmp0.astype(float)
m = b * tmp0.astype(float)

To align the naming with the comment.
Also: what happens if an np.float32 array is passed as a? Does this cast to float (i.e. float64) change the behavior of the current implementation, i.e. the dtype of the returned value?
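To illustrate the dtype concern, here is a minimal demonstration (not the PR's code): in NumPy, .astype(float) means .astype(np.float64), so a float32 input is upcast at this step and the product inherits float64 as well.

```python
import numpy as np

a = np.array([0.0, 1.0], dtype=np.float32)
mask = (a == a.max())

# .astype(float) is .astype(np.float64): the boolean mask becomes float64.
m = mask.astype(float)
print(m.dtype)        # float64

# float32 array * float64 array promotes to float64.
print((a * m).dtype)  # float64, even though a is float32
```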


# suppress warnings about log of zero
with np.errstate(divide='ignore'):
s = np.sum(tmp, axis=axis, keepdims=keepdims)
s0 = np.sum(tmp0, axis=axis, keepdims=keepdims)
sf = s + s0

Suggested change
sf = s + s0
s = s0 + s1

It would be nice if the result, i.e. the sum of both terms, were called s as before.

sf *= sgn


precise = ((s0>0) & (s>0))

From here on, it seems to me that the code could be simpler: s0 and s are either 0 or positive, and s == 0 does not hurt, so special-casing only s0 == 0 should be sufficient.
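The suggested simplification could be sketched like this (a hedged sketch, not the PR's code; the names follow the review thread, with s0 the contribution from max-attaining terms and s the rest, both assumed non-negative). Only s0 == 0 needs a special case, because s == 0 is harmless: log1p(0) == 0.

```python
import numpy as np

def combine_logs(s0, s):
    """Sketch: combine log(s0 + s) precisely, special-casing only s0 == 0."""
    if s0 > 0:
        # Precise branch: log(s0 + s) = log(s0) + log1p(s / s0);
        # s == 0 simply contributes log1p(0) == 0 here.
        return np.log(s0) + np.log1p(s / s0)
    # s0 == 0: fall back to log(s); log(0) gives -inf, warning suppressed.
    with np.errstate(divide='ignore'):
        return np.log(s)
```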

@dschmitz89
Contributor

Not trying to block this PR, but wouldn't it make sense in the long run to implement logsumexp in Cython or C so that it can be exposed via cython_special too? scikit-learn carries a Cython implementation, another case of duplication across the ecosystem.

@lorentzenchr
Contributor

Not trying to block this PR, but wouldn't it make sense in the long run to implement logsumexp in Cython or C so that it can be exposed via cython_special too? scikit-learn carries a Cython implementation, another case of duplication across the ecosystem.

Note that even if it were available in the SciPy Cython API, scikit-learn would probably still keep its own Cython implementation to make inlining possible. The same is already true for expit; the inlining makes a performance difference that I measured once.

@lucascolley
Member

Hey @sadigulcelik, would you like to return to this? There are some comments above to address, but it sounds like this was close.

@lucascolley lucascolley added this to the 1.14.0 milestone Mar 16, 2024
@lucascolley lucascolley changed the title BUG: fixes precision issue with logsumexp BUG: special.logsumexp: fix precision issue May 18, 2024
@ev-br
Member

ev-br commented May 20, 2024

Needs a rebase now. Would be great to decide on this one either way.

@tylerjereddy
Contributor

This PR hasn't seen commit activity for more than a year, so I think bumping the milestone is the right call. It may need a new champion if the original author is busy.

@lucascolley
Member

It looks like this PR will be superseded by gh-21597. Thanks for this anyway @sadigulcelik!

@lucascolley lucascolley removed this from the 1.15.0 milestone Sep 21, 2024
Successfully merging this pull request may close these issues.

BUG: special: Loss of precision in logsumexp
7 participants