Optimize admin stats memory use #16981

@daixiang0

Title: Optimize admin stats memory use

Description:

There are a few related problems which I think can be tackled independently, all of which are related to memory consumption in stats endpoints when there are a large number of stats (e.g. due to a large number of clusters).

  1. The browser may crash if you send stats for 100k clusters. There's simply too much data, especially with serialized names.
  2. The server may crash because implementations buffer all the stats in order to sort them and send them out in one chunk.
  3. The Prometheus admin stats handler uses too much memory, as described in "Prometheus stats handler used too much memory" (#16139) and "buffer: add chunker" (#16591). Currently all serialized bytes are always buffered in the admin handler, and they may then be buffered again in the networking layer for a slow client.
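To make the buffering concern in items 2 and 3 concrete, here is a minimal sketch of emitting serialized stats in bounded chunks rather than as one large buffer. `ChunkedWriter`, its `Sink` callback, and the chunk size are hypothetical names chosen for illustration; they are not Envoy APIs and not the chunker from #16591.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical helper (not an Envoy class): accumulates serialized stat lines
// and hands them to a sink once at least chunk_size bytes are pending, so the
// handler never holds the full response in memory.
class ChunkedWriter {
public:
  using Sink = std::function<void(const std::string&)>;

  ChunkedWriter(size_t chunk_size, Sink sink)
      : chunk_size_(chunk_size), sink_(std::move(sink)) {}

  // Appends one serialized line; flushes when the pending bytes reach the threshold.
  void append(const std::string& line) {
    pending_ += line;
    if (pending_.size() >= chunk_size_) {
      flush();
    }
  }

  // Sends whatever is pending (if anything) to the sink and clears the buffer.
  void flush() {
    if (pending_.empty()) {
      return;
    }
    sink_(pending_);
    pending_.clear();
  }

private:
  const size_t chunk_size_;
  Sink sink_;
  std::string pending_;
};
```

A caller would `append()` each serialized stat and call `flush()` once at the end. This only bounds memory in the handler itself; a slow client can still cause buffering in the networking layer unless the sink applies backpressure, which this sketch does not attempt.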

A challenge with approaches around streaming data out from /stats or for Prometheus is that the data is held in unsorted hash-maps by the stats allocator/store, and we need to present fully sorted data to users looking at admin /stats, and collated data to Prometheus, due to tag grouping (I think...I'm not a Prometheus expert).

@pradeepcrao has enabled solutions to these by adding forEach-type accessors to the stats system, at least for counters, gauges, and text readouts. Histograms still need to be done. To tackle the three symptoms above, I am experimenting with a paging algorithm here: https://github.com/jmarantz/envoy/blob/stats-stream/source/common/stats/filter.h . This provides a possibly-efficient-enough O(NumStats * log(PageSize)) algorithm to use with forEachXXX to get pages of sorted stats suitable for use with paging controls in a new flavor of admin page, e.g. /admin/stats?format=html. It may also provide
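As an illustration of how an O(NumStats * log(PageSize)) bound can be reached, here is a minimal sketch that pulls one sorted page out of an unsorted store using a bounded max-heap. This is not the code in the linked filter.h, which may work differently. `FakeStatStore`, `forEachCounter` as declared here, and `nextPage` are hypothetical stand-ins, and the sketch assumes stat names are unique and non-empty.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <queue>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Stat = std::pair<std::string, uint64_t>;

// Hypothetical stand-in for the stats store: unsorted storage that is only
// reachable through a forEach-style visitor.
class FakeStatStore {
public:
  void add(const std::string& name, uint64_t value) { stats_[name] = value; }

  void forEachCounter(const std::function<void(const std::string&, uint64_t)>& fn) const {
    for (const auto& kv : stats_) {
      fn(kv.first, kv.second);
    }
  }

private:
  std::unordered_map<std::string, uint64_t> stats_;
};

// Returns up to page_size stats whose names sort strictly after `after`,
// in ascending name order. Pass "" as `after` to get the first page.
std::vector<Stat> nextPage(const FakeStatStore& store, const std::string& after,
                           size_t page_size) {
  assert(page_size > 0);
  // Max-heap by name: the top is the largest name currently kept for the page.
  auto less_by_name = [](const Stat& a, const Stat& b) { return a.first < b.first; };
  std::priority_queue<Stat, std::vector<Stat>, decltype(less_by_name)> heap(less_by_name);

  store.forEachCounter([&](const std::string& name, uint64_t value) {
    if (name <= after) {
      return; // Belongs to an earlier page.
    }
    if (heap.size() < page_size) {
      heap.emplace(name, value);
    } else if (name < heap.top().first) {
      heap.pop(); // Evict the largest name; each heap operation is O(log page_size).
      heap.emplace(name, value);
    }
  });

  // Popping yields descending order, so reverse to get ascending order.
  std::vector<Stat> page;
  page.reserve(heap.size());
  while (!heap.empty()) {
    page.push_back(heap.top());
    heap.pop();
  }
  std::reverse(page.begin(), page.end());
  return page;
}
```

Each page request visits every stat once and keeps at most `page_size` entries in memory. Passing the last name of one page as `after` for the next request gives simple forward paging, at the cost of a full scan per page.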

For details, refer to the Google docs.

@jmarantz please take a look and add anything I missed.
