-
Notifications
You must be signed in to change notification settings - Fork 9.7k
[BUGFIX] PromQL: Noinline kahanSumInc, to reduce numerical errors #16895
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Observed on go1.24.4 darwin/arm64, the compensation variable is not correctly calculated. Note you may have to run the tests several times to see it, to get the right ordering of series. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
should we measure how much slower? if the impact is too large, maybe we can disable the potential |
Doesn't show up in the benchmarks. I added
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice findings
Thanks for this discovery. From my semi-informed perspective, this seems like a plausible explanation. With the Kahan trick inlined, I could see how the compiler says at some point: "Hey, this compensation term is mathematically always zero, I can just optimize it away." |
#### What this PR does See prometheus/prometheus#16895 for further explanation. We've copied the code into MQE, so we need to apply the same change here as well. #### Which issue(s) this PR fixes or relates to prometheus/prometheus#16895 #### Checklist - [n/a] Tests updated. - [n/a] Documentation added. - [n/a] `CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`. If changelog entry is not needed, please add the `changelog-not-needed` label to the PR. - [n/a] [`about-versioning.md`](https://github.com/grafana/mimir/blob/main/docs/sources/mimir/configure/about-versioning.md) updated with experimental features.
Observed on go1.24.4 darwin/arm64, the compensation variable is not correctly calculated.
Note you may have to run the tests several times to see it, to get the right ordering of series.
Since the problem went away when I tried to run under the debugger (which disables optimisations), and went away when I added sufficient printf statements (which discourages inlining), I hit upon the idea of disabling inlining for this one function.
See #16714 (comment)
It's going to slow things down a bit, but turns out
IsInf
is pretty slow already.Fixes #16714.
Possibly related: https://en.wikipedia.org/wiki/Kahan_summation_algorithm#Possible_invalidation_by_compiler_optimization