-
Notifications
You must be signed in to change notification settings - Fork 5.1k
Prom stats perf improvements #24998
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prom stats perf improvements #24998
Conversation
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
@ggreenway can you PTAL? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking really good. Thanks for making the test changes more minimal; that made reviewing much easier.
/wait
source/server/admin/stats_render.cc
Outdated
const std::string& prefixed_tag_extracted_name) { | ||
auto parent_histogram = dynamic_cast<Stats::ParentHistogram*>(metric.get()); | ||
if (parent_histogram != nullptr) { | ||
// TODO(rulex123): support for the 3 histogram "bucket modes" offered in other |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this should be a TODO. There is only one valid way to represent histograms in the prometheus exposition format, and this is it. I think you can just delete the TODO comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense.
FYI in #26472 I have added a new bucket mode tentatively called 'detailed' which will have the raw bucket data. That may also be useful for Prom as you can see a lot more interesting data with this IMO, but it might complicate aggregation in Prom across multiple Envoys.
Don't worry Greg -- that PR is still a ways away from being atomized into probably around 4 for review, but I think the first PR will be the addition of a new bucket-mode to show the original buckets computed by circlhist.
Nevertheless I agree removing this TODO is fine for now.
Signed-off-by: rulex123 <erica.manno@gmail.com>
/retest |
Retrying Azure Pipelines: |
Commit Message: Reproduces a scenario where it's difficult to use a streaming optimization for Prometheus stats based on scopes without introducing a bug. Context: Issue that /stats/prometheus endpoint takes too much much memory: Prometheus stats handler used too much memory. #16139 Solution for non-Prometheus endpoints to use less memory for /stats and run faster: admin: Streaming /stats implementation #19693 which I believe is working well. This solution mapped to Prometheus: Prom stats perf improvements #24998 A case where this solution has a duplicate# TYPE line due to two stats with the same tag-extracted-stat name from two different scopes: stats: prometheus scrape failed #27173 Note that the existing unit tests did not exercise that scenario so this PR adds a testcase. Signed-off-by: Joshua Marantz <jmarantz@google.com>
(i.e. doesn't cater for stats with identical tag extracted name but from different scopes). Signed-off-by: rulex123 <erica.manno@gmail.com>
Add support for streaming/chunking of prometheus stats - basically adapts the algorithm introduced in envoyproxy#19693 for prometheus stats (while keeping common parts of the algorithm in a shared class to avoid code duplication). With this patch we should be able to improve the issue described here envoyproxy#16139 Signed-off-by: rulex123 <erica.manno@gmail.com> Signed-off-by: Ryan Eskin <ryan.eskin89@protonmail.com>
Commit Message: Reverts envoyproxy#24998 See envoyproxy#27173 Additional Description: Risk Level: n/a Testing: n/a Docs Changes: n/a Release Notes: n/a Platform Specific Features: n/a Signed-off-by: Ryan Eskin <ryan.eskin89@protonmail.com>
…xy#27239) Commit Message: Reproduces a scenario where it's difficult to use a streaming optimization for Prometheus stats based on scopes without introducing a bug. Context: Issue that /stats/prometheus endpoint takes too much much memory: Prometheus stats handler used too much memory. envoyproxy#16139 Solution for non-Prometheus endpoints to use less memory for /stats and run faster: admin: Streaming /stats implementation envoyproxy#19693 which I believe is working well. This solution mapped to Prometheus: Prom stats perf improvements envoyproxy#24998 A case where this solution has a duplicate# TYPE line due to two stats with the same tag-extracted-stat name from two different scopes: stats: prometheus scrape failed envoyproxy#27173 Note that the existing unit tests did not exercise that scenario so this PR adds a testcase. Signed-off-by: Joshua Marantz <jmarantz@google.com> Signed-off-by: Ryan Eskin <ryan.eskin89@protonmail.com>
Commit Message: patch to add support for streaming/chunking of prometheus stats - basically adapts the algorithm introduced in #19693 for prometheus stats (while keeping common parts of the algorithm in a shared class to avoid code duplication).
With this patch we should be able to improve the issue described here #16139
Risk Level: only underlying implementation changes, but otherwise no change of behaviour/features. So low risk I would say.
Testing: current tests should remain unchanged (though they might be moved around due to refactoring of code); new tests are also added.
Docs Changes: n/a
Release Notes: n/a
[Optional Runtime guard:]
[Optional Fixes #Issue]: #16139
Draft PR (already went through some review) -> #24716
Benchmark data from running
stats_handler_speed_test.cc
on a dev box:BEFORE
AFTER