-
Notifications
You must be signed in to change notification settings - Fork 9.8k
[BUGFIX] PromQL: fix incorrect rounding of [1001ms]
to [1s]
and similar (selector sometimes misses a sample on dense samples)
#16478
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
In short, floating-point multiplication and floor rounding together caused this bug. So I fixed the conversion from Below is a simple example explaining everything. package main
import (
"fmt"
"math"
"time"
)
func OutdatedImpl(inSecondDuration float64) time.Duration {
return time.Duration(inSecondDuration*1000) * time.Millisecond
}
func UpdatedImpl(inSecondDuration float64) time.Duration {
return time.Duration(math.Round(inSecondDuration * float64(time.Second)))
}
func main() {
inSecondDurations := []float64{1.000, 1.001, 1.002}
var duration time.Duration
for _, inSecondDuration := range inSecondDurations {
duration = OutdatedImpl(inSecondDuration)
fmt.Printf("[outdated] duration in s: %v, in ms: %v, in ns: %v\n", inSecondDuration, duration.Milliseconds(), duration.Nanoseconds())
duration = UpdatedImpl(inSecondDuration)
fmt.Printf("[updated] duration in s: %v, in ms: %v, in ns: %v\n\n", inSecondDuration, duration.Milliseconds(), duration.Nanoseconds())
}
}
|
Signed-off-by: Zhang Zhanpeng <zhangzhanpeng.zzp@alibaba-inc.com>
bad33ac
to
0658923
Compare
Nice catch. prometheus/promql/parser/generated_parser.y Lines 962 to 982 in 26bddcf
and only converted to ms afterward. Wouldn’t it make more sense to parse them as ms from the start, given we want to have ms precision? |
+1, the loss of precision is introduced by It seems like I should ask the author of this code block about the considerations for converting to seconds instead of milliseconds? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes an issue where the function selector may miss a sample on dense samples. The key changes include:
- Adding new test cases in parse_test.go to cover various millisecond input cases.
- Modifying duration computations in generated_parser.y.go by replacing multiplication logic with math.Round.
Reviewed Changes
Copilot reviewed 2 out of 4 changed files in this pull request and generated no comments.
File | Description |
---|---|
promql/parser/parse_test.go | Adds test cases for MatrixSelector with varied millisecond inputs. |
promql/parser/generated_parser.y.go | Updates duration conversion logic to use math.Round. |
Files not reviewed (2)
- promql/parser/generated_parser.y: Language not supported
- promql/promqltest/testdata/functions.test: Language not supported
Comments suppressed due to low confidence (4)
promql/parser/generated_parser.y.go:1375
- Using math.Round with a time.Second multiplier may round the offset to a whole second, potentially losing millisecond precision. Verify that this behavior correctly handles dense sample scenarios as intended.
yylex.(*parser).addOffset(yyDollar[1].node, time.Duration(math.Round(numLit.Val*float64(time.Second))))
promql/parser/generated_parser.y.go:1426
- Replacing the previous millisecond conversion with math.Round may result in rounding duration values to the nearest whole second instead of maintaining millisecond accuracy. Ensure that this change gives the intended behavior under all expected inputs.
rangeNl = time.Duration(math.Round(numLit.Val * float64(time.Second)))
promql/parser/generated_parser.y.go:1446
- The introduction of math.Round for step duration may round to whole seconds, which could affect subquery behavior if sub-second precision is needed. Confirm that the loss of millisecond granularity is acceptable.
stepNl = time.Duration(math.Round(numLit.Val * float64(time.Second)))
promql/parser/generated_parser.y.go:1463
- Rounding the range using math.Round shifts the precision from milliseconds to seconds; ensure this modification aligns with the intended fix for missing samples on dense metrics.
rangeNl = time.Duration(math.Round(numLit.Val * float64(time.Second)))
I think the changes were made in #9138 |
It is converted to seconds because when using a plain number in Prometheus, it is considered to be seconds. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we move that to a parseNumLit function ?
But 1001ms isn't a plain number, as I mentionedin #16478 (comment), this wasn't the case before #9138 I'm not a big fan of "let's parse all numbers the same way, but let's have some special logic for durations", it even goes against #9138. |
#9138 makes them all similar. What would you propose as a fix? I believe the fix could also be to introduce a .Duration() time.Duration to NumberLiteral and use it when we need it. |
I haven't looked closely at the code, so I can't provide a complete/working solution. That said, I do think it's a pity that we lose such info/precision during parsing. I was wondering in #16478 (comment) we could just return durations in ms instead of s in some cases. Or even for all cases given we want ms precision, and then make sure the ms->s is done correctly when needed (I don't see why we'd need that, if we only work with ms) |
For now this is a correct fix for the issue. I think we could change to Milliseconds but at the expense of potentially breaking downstream users, do I am happy with this PR as it. |
Thanks! |
[1001ms]
to [1s]
and similar (selector sometimes misses a sample on dense samples)
Hi, I was rewording descriptions for the changelog to make them more useful for end-users, so I expanded the title of this PR. I did not understand what "function selector" was supposed to mean, so dropped the word "function". Something like |
@bboreham Thanks! Your revision is spot on. |
Fixes #16247