@npazosmendez npazosmendez commented Oct 30, 2023

Part of #13105

Context

This PR is the first of several that will iteratively work on the new Remote Write 2.0 format, which we intend to keep in the remote-write-1.1 feature branch. This PR in particular focuses on introducing string interning to the new format, and continues the work from #11999, which we iterated on until arriving at the proposal in this PR.

The goal behind interning is to reduce resource usage in both senders and receivers: network egress, CPU and potentially memory. There's the question of whether just changing the compression algorithm from Snappy to something else would be sufficient and make interning unnecessary. We explored that question and concluded that interning is still a major win; see this comment for more details.

As a separate note, we will be including interning of all string data as part of this format via follow-up PRs. This will include exemplar labels as well as per-series metadata, rebasing #11640 on top of the changes in this PR and leveraging the symbols interning.

This PR combines work/help/suggestions from @cstyan, @alexgreenbank, @npazosmendez, @bwplotka, @pracucci, @bboreham

How the interning works

The interning technique is simple: a write request includes a request-level []string with the purpose of de-duplicating strings across timeseries. Each timeseries encodes its labels as indices to that array, using a []uint32 of the form <lbl1_key_index>,<lbl1_val_index>, <lbl2_key_index>,<lbl2_val_index>....

See Prometheus Remote-Write 1.1 Specification for more details.

Benchmark

This benchmark uses the scripts in https://github.com/prometheus/prometheus/tree/remote-write-1.1/scripts/remotewrite11-bench to run (sender, receiver) pairs with the current remote write version (v1) and with the one this PR introduces (v11). Both senders scrape over 100 pods from a real Kubernetes namespace, which includes various Mimir microservices. I ran the benchmark for 2 hours.

image

These numbers show that string interning:

  • Greatly reduces senders' network egress (-47%) and receivers' CPU usage (-53%). See red boxes.
  • Reduces memory and CPU usage of senders, and does not increase memory usage of receivers. These margins are smaller (and somewhat random) and may not be representative of a final version of the protocol, but they show that the benefits mentioned above can be achieved without regressions, and possibly with further gains. See yellow boxes.

Go benchmark (old)

This benchmark was run using an old, early version of the PR. Kept for future reference.

Details

These benchmarks compare CPU and payload sizes of the current and the new protocol, including construction of the request, marshaling and compression as done by the queue manager. The benchmarks are run with different fixtures that try to mimic X processes emitting Go runtime metrics (see createDummyTimeSeries).

$ go test -benchmem -run=^$ -bench "^BenchmarkBuild(Reduced)?WriteRequest$" github.com/prometheus/prometheus/storage/remote
goos: linux
goarch: amd64
pkg: github.com/prometheus/prometheus/storage/remote
cpu: 12th Gen Intel(R) Core(TM) i7-12700
BenchmarkBuildWriteRequest/2_instances-20                  91128             12449 ns/op              1933 compressedSize/op          80 B/op          1 allocs/op
BenchmarkBuildWriteRequest/10_instances-20                 19920             59286 ns/op              8156 compressedSize/op          80 B/op          1 allocs/op
BenchmarkBuildWriteRequest/1k_instances-20                   148           7437633 ns/op            784083 compressedSize/op          80 B/op          1 allocs/op
BenchmarkBuildReducedWriteRequest/2_instances-20           59606             19980 ns/op              1786 compressedSize/op          96 B/op          1 allocs/op
BenchmarkBuildReducedWriteRequest/10_instances-20          15636             75689 ns/op              5897 compressedSize/op          96 B/op          1 allocs/op
BenchmarkBuildReducedWriteRequest/1k_instances-20            163           7334976 ns/op            518392 compressedSize/op          96 B/op          1 allocs/op
PASS
ok      github.com/prometheus/prometheus/storage/remote 10.929s
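The `compressedSize/op` column above is a custom metric rather than one of Go's built-in ones. Go benchmarks emit custom columns through `b.ReportMetric`; the sketch below shows that mechanism in isolation. It is not the PR's benchmark: `buildPayload`, `compress` and `benchBuild` are hypothetical stand-ins, the payload is plain text rather than a marshaled write request, and gzip is used only because it ships with the standard library (remote write uses Snappy).

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"testing"
)

// buildPayload is a hypothetical stand-in for building and marshaling a
// write request: it concatenates label strings for n instances.
func buildPayload(n int) []byte {
	var buf bytes.Buffer
	for i := 0; i < n; i++ {
		fmt.Fprintf(&buf, "__name__=go_goroutines,job=demo,instance=host-%d\n", i)
	}
	return buf.Bytes()
}

// compress gzips p. gzip is an assumption made for portability;
// the real queue manager compresses with Snappy.
func compress(p []byte) ([]byte, error) {
	var out bytes.Buffer
	zw := gzip.NewWriter(&out)
	if _, err := zw.Write(p); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return out.Bytes(), nil
}

// benchBuild returns a benchmark that times build+compress and reports the
// compressed size as a custom per-op metric.
func benchBuild(n int) func(b *testing.B) {
	return func(b *testing.B) {
		b.ReportAllocs()
		var size int
		for i := 0; i < b.N; i++ {
			c, err := compress(buildPayload(n))
			if err != nil {
				b.Fatal(err)
			}
			size = len(c)
		}
		b.ReportMetric(float64(size), "compressedSize/op")
	}
}

func main() {
	// testing.Benchmark lets the sketch run outside of `go test`.
	for _, n := range []int{2, 10, 1000} {
		r := testing.Benchmark(benchBuild(n))
		fmt.Printf("%d instances: %d ns/op, %.0f compressedSize/op\n",
			n, r.NsPerOp(), r.Extra["compressedSize/op"])
	}
}
```

Reporting the size as a metric keeps payload size and CPU time in the same result line, which is what makes the side-by-side comparison in the output above possible.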

The trend goes as one would expect: the more repetition there is, the faster the new encoding and the smaller the new payload when compared with the original protocol. Notably, CPU time is worse when the request doesn't have a lot of repetition (2 and 10 instances), although the compressed payload is smaller in every case shown above. To get a better understanding of how this behaves with a more realistic workload, see the e2e benchmarks.

e2e benchmark (old)

This benchmark was run using an old, early version of the PR. Kept for future reference.

Details

This benchmark runs two receivers and two senders, each sender/receiver pair running a different version: sender_1.0 => receiver_1.0 and sender_1.1 => receiver_1.1. Both senders scrape themselves and all the other processes in the benchmark, as well as an entire real Kubernetes namespace with more than 100 pods, including various services from Mimir.

TODO(@npazosmendez): share the util scripts to reproduce this benchmark with an arbitrary namespace.

Below are the values of some relevant counters after running everything for around 30 minutes:

CPU

  • process_cpu_seconds_total:
      • sender-v1: 109.16
      • sender-v11: 109.13
      • receiver-v1: 164.45
      • receiver-v11: 127.49

Senders' CPU usage is practically the same. The receiver using the new protocol uses ~22% less CPU.

image

Network

  • prometheus_remote_storage_bytes_total:
      • sender-v1: 321159262
      • sender-v11: 243318645

Sender using the new protocol sent ~25% less data.

image

Memory

In-use heap size does not vary much (first screenshot below). The number of objects does change for the receivers, though: the one running the new protocol is almost consistently lower (second screenshot), which should lower the stress on the GC.
Also, the total bytes allocated is lower:

  • go_memstats_alloc_bytes_total:
      • sender-v1: 7394260880
      • sender-v11: 7374288920
      • receiver-v1: 47286295416
      • receiver-v11: 33116164600

Senders are practically the same. The receiver with the new protocol allocated ~30% fewer bytes.
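The relative savings quoted in the CPU, Network and Memory sections can be rechecked directly from the counter values listed there. The snippet below is a small arithmetic sketch; `reduction` is a hypothetical helper, and the numbers are copied from the lists above.

```go
package main

import "fmt"

// reduction returns how much smaller newV is than oldV,
// as a percentage of oldV.
func reduction(oldV, newV float64) float64 {
	return (oldV - newV) / oldV * 100
}

func main() {
	// process_cpu_seconds_total, receiver-v1 vs receiver-v11.
	fmt.Printf("receiver CPU:        %.1f%% less\n", reduction(164.45, 127.49))
	// prometheus_remote_storage_bytes_total, sender-v1 vs sender-v11.
	fmt.Printf("sender bytes sent:   %.1f%% less\n", reduction(321159262, 243318645))
	// go_memstats_alloc_bytes_total, receiver-v1 vs receiver-v11.
	fmt.Printf("receiver allocation: %.1f%% less\n", reduction(47286295416, 33116164600))
}
```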

image

image

image

@npazosmendez npazosmendez force-pushed the alexnico-remote-write-1-1 branch from 7c008b9 to ebdf073 Compare October 30, 2023 21:49
@npazosmendez npazosmendez marked this pull request as ready for review November 2, 2023 13:21

@GiedriusS GiedriusS left a comment

Stupid question but wouldn't a generic compression algorithm like Snappy catch cases like this where the same string(-s) are repeated over and over again? Maybe it was mentioned somewhere 🤔

bboreham commented Nov 3, 2023

That's what we have today. However, Snappy does not do a great job (it's designed to be fast rather than tight), and it creates overhead in the receiver, which has to decompress all the data.
If the receiver only gets one copy of each string, this saves space and probably garbage.

@npazosmendez

We are experimenting with an alternative interning method that is giving better results. I will mark this as a draft in the meantime.

On a separate note: I'm working on some benchmarks to compare the current protocol + better compression algorithm vs this new proposed protocol. Will share those soon

@npazosmendez npazosmendez marked this pull request as draft November 7, 2023 20:58
@cstyan cstyan mentioned this pull request Nov 8, 2023
24 tasks
@npazosmendez npazosmendez force-pushed the alexnico-remote-write-1-1 branch from d3de9d7 to 46b84ab Compare November 9, 2023 13:18
@npazosmendez

Here's an experiment on string interning vs using other compression algorithms with the current protocol: #13105 (comment)

@npazosmendez npazosmendez marked this pull request as ready for review November 29, 2023 20:06
cstyan and others added 15 commits December 19, 2023 14:27
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com>
write request format

cstyan and others added 16 commits December 19, 2023 14:29
the tests fail to do `range labels.Labels` on CI

Co-authored-by: bwplotka <bwplotka@gmail.com>
Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com>
also adapt tests to the new format

@npazosmendez npazosmendez force-pushed the alexnico-remote-write-1-1 branch from 4d9d3d9 to a8224cc Compare December 19, 2023 17:33
@npazosmendez npazosmendez force-pushed the alexnico-remote-write-1-1 branch from 25c752a to 48f9285 Compare December 21, 2023 13:03
@npazosmendez npazosmendez changed the title from "remote write 1.1: new proto format with string interning" to "remote write 2.0: new proto format with string interning" Dec 28, 2023
Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com>
@npazosmendez npazosmendez force-pushed the alexnico-remote-write-1-1 branch from 3b0ce6a to acd0353 Compare December 28, 2023 15:28

@bwplotka bwplotka left a comment

LGTM, we agreed to merge it as-is and iterate, thanks for the initial cleanup!

@@ -432,6 +434,9 @@ func main() {
a.Flag("enable-feature", "Comma separated feature names to enable. Valid options: agent, exemplar-storage, expand-external-labels, memory-snapshot-on-shutdown, promql-at-modifier, promql-negative-offset, promql-per-step-stats, promql-experimental-functions, remote-write-receiver (DEPRECATED), extra-scrape-metrics, new-service-discovery-manager, auto-gomaxprocs, no-default-scrape-port, native-histograms, otlp-write-receiver. See https://prometheus.io/docs/prometheus/latest/feature_flags/ for more details.").
Default("").StringsVar(&cfg.featureList)

a.Flag("remote-write-format", "remote write proto format to use, valid options: 0 (1.0), 1 (reduced format), 3 (min64 format)").

We probably can kill those later on

@npazosmendez npazosmendez merged commit 6a03f5a into prometheus:remote-write-1.1 Dec 28, 2023