Optimize admin stats memory use

*Title*: *Optimize admin stats memory use*

*Description*:

There are a few related problems which I think can be tackled independently, all of which are related to memory consumption in stats endpoints when there are a large number of stats (e.g. due to a large number of clusters).

1. The browser may crash if you send stats for 100k clusters. There's simply too much data, especially with serialized names.
2. The server may crash because implementations buffer all the stats to sort them and then send them out in one chunk.
3. The Prometheus admin stats handler consumes too much memory, as noted in #16139(https://github.com/envoyproxy/envoy/issues/16591). Currently, all serialized bytes are buffered in the admin handler, and they may then be buffered again in the networking layer for a slow client.
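
To illustrate the buffering concern in items 2 and 3, here is a minimal sketch of flushing serialized stats in bounded chunks rather than accumulating one large response. It is not Envoy code: `ChunkSink`, `writeInChunks`, and the `name: value` line format are hypothetical names chosen for illustration, and the sketch ignores sorting entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sink: receives one bounded chunk of serialized output at a time
// (for example, a callback that hands the bytes to the network layer).
using ChunkSink = std::function<void(const std::string&)>;

// Serializes each stat as "name: value\n" and flushes to `sink` whenever the
// local buffer reaches `chunk_bytes`. The buffer never grows beyond roughly
// `chunk_bytes` plus one line, regardless of how many stats there are.
void writeInChunks(const std::vector<std::pair<std::string, unsigned long long>>& stats,
                   std::size_t chunk_bytes, const ChunkSink& sink) {
  assert(chunk_bytes > 0);
  std::string buffer;
  for (const auto& stat : stats) {
    buffer += stat.first;
    buffer += ": ";
    buffer += std::to_string(stat.second);
    buffer += '\n';
    if (buffer.size() >= chunk_bytes) {
      sink(buffer);
      buffer.clear();
    }
  }
  if (!buffer.empty()) {
    sink(buffer);  // flush the final partial chunk
  }
}
```

A real implementation would also need back-pressure from the sink so that a slow client does not simply move the buffering into the networking layer.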

A challenge with approaches around streaming data out from /stats or for Prometheus is that the data is held in unsorted hash-maps by the stats allocator/store, and we need to present fully sorted data to users looking at admin /stats, and collated data to Prometheus, due to tag grouping (I think...I'm not a Prometheus expert).

@pradeepcrao has enabled solutions to these by adding forEach-type accessors to the stats system, at least for counters, gauges, and text-readouts. Histograms still need to be done. To tackle the three symptoms above, I am experimenting with a paging algorithm here: https://github.com/jmarantz/envoy/blob/stats-stream/source/common/stats/filter.h . This provides a possibly-efficient-enough ( _O(NumStats * log(PageSize))_ ) algorithm to use with `forEachXXX` to get pages of sorted stats suitable for use with paging controls in a new flavor of admin page, e.g. `/admin/stats?format=html`. It may also provide 

For details, refer to this [Google doc](https://docs.google.com/document/d/1k2wyZo1pGr8_jaxd7wfpPHctR47bhQgk8E5gFl1HJLU/edit?usp=sharing).
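
As a rough illustration of the paging idea described above, the sketch below selects one sorted page from an unsorted iteration using a bounded max-heap, which gives the _O(NumStats * log(PageSize))_ cost per page. This is not the code in `filter.h`: `ForEachName` is a hypothetical stand-in for a `forEachXXX`-style accessor, `sortedPage` is an invented name, and the sketch assumes stat names are unique and non-empty.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <vector>

// Hypothetical stand-in for a forEachXXX-style accessor: invokes the visitor
// once per stat name, in arbitrary (unsorted) order.
using ForEachName = std::function<void(const std::function<void(const std::string&)>&)>;

// Returns up to `page_size` names strictly greater than `after`, in ascending
// order. Pass an empty `after` for the first page and the last name of the
// previous page for each following page.
std::vector<std::string> sortedPage(const ForEachName& for_each_name,
                                    const std::string& after, std::size_t page_size) {
  assert(page_size > 0);
  // Max-heap holding the smallest `page_size` candidates seen so far; the top
  // is the largest of them and is evicted when a smaller candidate arrives.
  std::priority_queue<std::string> heap;
  for_each_name([&](const std::string& name) {
    if (name <= after) {
      return;  // belongs to an earlier page
    }
    if (heap.size() < page_size) {
      heap.push(name);
    } else if (name < heap.top()) {
      heap.pop();
      heap.push(name);
    }
  });
  // Drain the heap from largest to smallest, filling the page back to front.
  std::vector<std::string> page(heap.size());
  for (std::size_t i = page.size(); i > 0; --i) {
    page[i - 1] = heap.top();
    heap.pop();
  }
  return page;
}
```

Memory stays proportional to the page size rather than the total number of stats. The trade-off is that each page requires a full pass over all stats.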

@jmarantz, please take a look and add anything I missed.




Optimize admin stats memory use #16981
