-
Notifications
You must be signed in to change notification settings - Fork 9.7k
fix(promql): histogram_quantile and histogram_fraction NaN observed in native histogram #16724
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
e276fd7
to
abef671
Compare
Fix TODO. Related to prometheus/prometheus#16578 Related to prometheus/prometheus#16724 Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
abef671
to
0553341
Compare
4641db0
to
3ff6dce
Compare
Fix TODO. Related to prometheus/prometheus#16578 Related to prometheus/prometheus#16724 Related to prometheus/prometheus#16580 Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Fix TODOs. And align with new implementation. Related to prometheus/prometheus#16578 Related to prometheus/prometheus#16724 Related to prometheus/prometheus#16580 Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Fixes: #16578 See the issue for detailed explanation. When a histogram had only NaN observations and no normal observations, we returned 0 from the quantile, which is completely wrong. If there were normal observations but we went over them, we returned the upper bound of the existing buckets, however that contradicts expectations on histogram_fraction. Now we return NaN if the quantile is calculated to be over all normal observations, falling into NaNs (in a virtual +Inf bucket). We also return info level annotations if we see any NaN observations. The annotation calls out if we returned NaN or even if we took the virtual +Inf bucket into account. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Fixes: #16580 According to the specification we should not take NaN observations into account when calculating the fraction. This commit fixes that and adds an info level annotation to let the user know about this. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
3ff6dce
to
61bb15a
Compare
Fix TODOs. And align with new implementation. Related to prometheus/prometheus#16578 Related to prometheus/prometheus#16724 Related to prometheus/prometheus#16580 Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good in general, just minor things.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, let's keep the old behavior for histograms where the sum is not NaN and only do the NaN return for histograms where the sum is NaN.
The histogram could just be in breach of the spec, of course, but then it is still a defined behavior to return bucket.Upper as we have don so far anyway.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
…2676) * fix(nh-spec): update histogram_quantile and histogram_fraction docs Fix TODOs. And align with new implementation. Related to prometheus/prometheus#16578 Related to prometheus/prometheus#16724 Related to prometheus/prometheus#16580 Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Fixes: #16578
See the issue for detailed explanation.
When a histogram had only NaN observations and no normal observations,
we returned 0 from the quantile, which is completely wrong. If there were
normal observations but we went over them, we returned the upper bound of
the existing buckets, however that contradicts expectations on
histogram_fraction. Now we return NaN if the quantile is calculated to be
over all normal observations, falling into NaNs (in a virtual +Inf bucket).
We also return info level annotations if we see any NaN observations.
The annotation calls out if we returned NaN or even if we took the
virtual +Inf bucket into account.
Fixes: #16580
According to the specification we should not take NaN observations
into account when calculating the fraction. This commit fixes that
and adds an info level annotation to let the user know about this.