-
Notifications
You must be signed in to change notification settings - Fork 9.7k
promql: histogram_fraction for bucket histograms #16095
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
promql: histogram_fraction for bucket histograms #16095
Conversation
aff63b6
to
cebd907
Compare
cebd907
to
d35d324
Compare
Nice! I believe the only question is -- do we (and users) accept the kind of errors you will get from calculating this from fixed buckets e.g. if you have only le=0.1 and le=10, le=100 and you do Anyway, I think it makes sense to extrapolate some usefulness here anyway, with the right documentations and maybe errors if potential start,end does not make sense? 🤔 I'm sure @beorn7 has opinions on this already too. |
We have to use the _buckets to get the "le" label at least. The calculation is essentially the same as for native histograms, it just depends on "good" choices for bucket boundaries more ( i wonder if linear interpolation is right choice though! ). If we agree that such a function would be valuable then we should add the documentation that its an approximation if the "upper/lower" parameters do not align with any bucket boundaries though - i think that is sensible! |
Nitty technical thoughts and answers:
|
My general thoughts: tl;dr: If a sufficient amount of people desire this sufficiently strongly, I guess we can add this. There is not much technical downside to adding it (implementation is not very complicated; it doesn't add "yet another function" to PromQL, just modifies the behavior of an existing function). I'm not convinced, though, that this is something that people should desire… I haven't really bought yet the two prime incentives for this:
|
Conclusion: If you left it to me to decide, I would say no. But whoever feels strongly that we should have this, please say so. You don't even have to convince me. You just have to state that you are convinced. If there is critical mass for this (and no objections against it otherwise), I will not oppose it. |
The problem is this work is not "instant". We have thousands of Prometheus instances for thousands of services. It's going to take us months to migrate and cross-coordinate tooling with teams. And we might need to rollout and rollback native histograms. So by having the option to pre-convert queries from classic to native allows us to stagger the changes across various teams. So, I feel fairly strongly that this is something very useful for us to support. It allows for flexibility forwards and backwards in mixed deployment environments. |
But the query is still different, isn't it? Or are you doing something like |
Straw poll on Slack got a few thumbs up and no thumbs down. So let's do this. I'll review this PR ASAP. @MichaHoffmann could you update the documentation in the meantime? It should be part of this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good. A bit more testing would be good. Also to reduce the code duplication where it is easy. (But don't sweat it. I think a bit of duplication is better than convoluted contraptions to make the code as DRY as possible.)
And as said before, we need an update for the documentation.
Thank you! Ill address the comments over the weekend! |
1535c77
to
7ae848c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much.
Code looks good, let's just improve documentation a bit more as described in the comment.
@MichaHoffmann are you still working on this? As said, we are only missing some documentation improvements. |
Yeah! Will add your suggestions - I just didn't find a lot of energy and time lately to do it ! Ill do it over the weekend |
This PR extends the histogram_fraction function to also work with classic bucket histograms. This is beneficial because it allows expressions like sum(increase(my_bucket{le="0.5"}[10m]))/sum(increase(my_total[10m])) to be written without knowing the actual values of the "le" label, easing the transition to native histograms later on. It also feels natural since histogram_quantile also can deal with classic histograms. Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
7ae848c
to
61c0cf8
Compare
* Add documentation and reduce code duplication * Fix a bug in linear interpolation between bucket boundaries * Add more PromQL tests Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
61c0cf8
to
02cfc46
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
What do you think about my refinement?
Co-authored-by: Björn Rabenstein <github@rabenste.in> Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Yeah, reads good to me, i committed the suggestion! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you.
* Upgrade mimir-prometheus * Fix breaking change in `NewAPI` * Sync upstream PromQL engine test cases * Modify `histogram_stddev` and `histogram_stdvar` to match new behaviour introduced in prometheus/prometheus#16444 * Simplify code * Remove outdated comment This issue was fixed in prometheus/prometheus#15828. * Remove unnecessary variable * Rename operator and files * Extract `histogram_quantile`-specific logic * Add support for classic histograms in `histogram_fraction` See prometheus/prometheus#16095 * Bring in changes from grafana/mimir-prometheus#871 * Invert build tag logic to match prometheus/prometheus#16436 * Sync query engine tests with upstream * Fix breaking changes * Regenerate OTLP code
This PR extends the
histogram_fraction
function to also work with classic bucket histograms. This is beneficial because it allows expressions likesum(increase(my_bucket{le="0.5"}[10m]))/sum(increase(my_total[10m]))
to be written without knowing the actual values of the "le" label, easing the transition to native histograms later on.It also feels natural since
histogram_quantile
also can deal with classic histograms.The code was cobbled together from the parts that let
histogram_quantile
work with buckets and native histograms and parts of theHistogramFraction
functions adjusted for classic histograms. Being cobbled together like this it involves a bit of code duplication that could probably be refactored. I wanted to open the PR already though to see if this is reasonable in the first place though!