remote write 2.0: new proto format with string interning #13052
Conversation
Force-pushed from 7c008b9 to ebdf073.
Stupid question but wouldn't a generic compression algorithm like Snappy catch cases like this where the same string(-s) are repeated over and over again? Maybe it was mentioned somewhere 🤔
That's what we have today. However, Snappy does not do a great job (it's designed to be fast rather than tight), and it creates overhead in the receiver, which has to decompress all the data.

We are experimenting with an alternative interning method that is giving better results. I will mark this as a draft in the meantime. On a separate note: I'm working on some benchmarks to compare the current protocol plus a better compression algorithm vs. this new proposed protocol. Will share those soon.
Force-pushed from d3de9d7 to 46b84ab.
Here's an experiment on string interning vs. using other compression algorithms with the current protocol: #13105 (comment)
Force-pushed from 4d9d3d9 to a8224cc.
Force-pushed from 25c752a to 48f9285.
Force-pushed from 3b0ce6a to acd0353.
LGTM, we agreed to merge it as-is and iterate, thanks for the initial cleanup!
```diff
@@ -432,6 +434,9 @@ func main() {
 	a.Flag("enable-feature", "Comma separated feature names to enable. Valid options: agent, exemplar-storage, expand-external-labels, memory-snapshot-on-shutdown, promql-at-modifier, promql-negative-offset, promql-per-step-stats, promql-experimental-functions, remote-write-receiver (DEPRECATED), extra-scrape-metrics, new-service-discovery-manager, auto-gomaxprocs, no-default-scrape-port, native-histograms, otlp-write-receiver. See https://prometheus.io/docs/prometheus/latest/feature_flags/ for more details.").
 		Default("").StringsVar(&cfg.featureList)

+	a.Flag("remote-write-format", "remote write proto format to use, valid options: 0 (1.0), 1 (reduced format), 3 (min64 format)").
```
We can probably kill those later on.
Part of #13105
Context
This PR is the first of multiple PRs that will iteratively work on the new Remote Write 2.0 format, which we intend to keep in the `remote-write-1.1` feature branch. This PR in particular focuses on introducing string interning to the new format, and is a continuation of the work from #11999, which we iterated on until arriving at the proposal in this PR.

The goal behind interning is to reduce resource usage in both senders and receivers: network egress, CPU, and potentially memory. There's the question of whether simply changing the compression algorithm from Snappy to something else would be sufficient and make this interning unnecessary. We explored that question and concluded that interning is still a major win; see this comment for more details.
As a separate note, we will be interning all string data as part of this format via follow-up PRs. This will include exemplar labels as well as per-series metadata, rebasing #11640 on top of the changes in this PR and leveraging the symbols interning.
This PR combines work/help/suggestions from @cstyan, @alexgreenbank, @npazosmendez, @bwplotka, @pracucci, @bboreham.
How the interning works
The interning technique is simple: a write request includes a request-level `[]string` whose purpose is to de-duplicate strings across timeseries. Each timeseries encodes its labels as indices into that array, using a `[]uint32` of the form `<lbl1_key_index>, <lbl1_val_index>, <lbl2_key_index>, <lbl2_val_index>, ...`. See the Prometheus Remote-Write 1.1 Specification for more details.
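As an illustration of the scheme described above, here is a minimal Go sketch (the struct and helper names are hypothetical, not the actual generated proto types):

```go
package main

import "fmt"

// series stores its labels as key/value index pairs into the
// request-level symbol table, as described above.
type series struct {
	labelsRefs []uint32
}

// writeRequest carries one shared []string for all series.
type writeRequest struct {
	symbols []string
	series  []series
}

// intern returns the index of s in the symbol table, appending it
// only if it has not been seen before.
func intern(symbols *[]string, seen map[string]uint32, s string) uint32 {
	if idx, ok := seen[s]; ok {
		return idx
	}
	idx := uint32(len(*symbols))
	*symbols = append(*symbols, s)
	seen[s] = idx
	return idx
}

func main() {
	seen := map[string]uint32{}
	req := writeRequest{}

	// Two series sharing the strings "__name__", "job" and "api":
	for _, labels := range [][]string{
		{"__name__", "http_requests_total", "job", "api"},
		{"__name__", "http_errors_total", "job", "api"},
	} {
		var refs []uint32
		for _, s := range labels {
			refs = append(refs, intern(&req.symbols, seen, s))
		}
		req.series = append(req.series, series{labelsRefs: refs})
	}

	fmt.Println(req.symbols)              // each distinct string stored once
	fmt.Println(req.series[1].labelsRefs) // [0 4 2 3]: indices only
}
```

The second series carries only four `uint32` refs instead of repeating the shared label strings, which is where the payload savings come from.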
Benchmark
This benchmark uses the scripts in https://github.com/prometheus/prometheus/tree/remote-write-1.1/scripts/remotewrite11-bench to run (sender, receiver) pairs with the current remote write version (`v1`) and with the one this PR introduces (`v11`). Both senders are scraping over 100 pods from a real Kubernetes namespace, which includes various Mimir microservices. I ran the benchmark for 2 hours.

These numbers show that string interning:
Go benchmark (old)

This benchmark was run using an old, early version of the PR. Keeping it for future reference.
Details
These benchmarks compare CPU usage and payload sizes of the current and the new protocol, including construction of the request, marshaling, and compression as done by the queue manager. The benchmarks are run with different fixtures that try to mimic X processes emitting Go runtime metrics (see createDummyTimeSeries).

The trend goes as one would expect: the more repetition, the faster the new encoding and the smaller the new payload compared with the original protocol. Notably, both CPU and payload sizes are worse if the request doesn't have a lot of repetition. To get a better understanding of how this behaves with a more realistic workload, see the e2e benchmarks.
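Why repetition dominates can be seen with a back-of-the-envelope size model (a rough sketch with made-up label data, not the actual benchmark code):

```go
package main

import "fmt"

// naiveBytes models the current protocol: every series repeats
// all of its label strings in full.
func naiveBytes(numSeries int, labels []string) int {
	n := 0
	for _, s := range labels {
		n += len(s)
	}
	return n * numSeries
}

// internedBytes models the interned protocol: each distinct string
// is stored once, plus a 4-byte uint32 ref per label string per series.
// (For simplicity this assumes all series share the same label strings.)
func internedBytes(numSeries int, labels []string) int {
	n := 0
	for _, s := range labels {
		n += len(s) // symbol table: each string once
	}
	return n + numSeries*len(labels)*4 // refs
}

func main() {
	labels := []string{"__name__", "go_goroutines", "job", "api", "instance", "10.0.0.1:9090"}
	for _, n := range []int{1, 10, 1000} {
		fmt.Printf("series=%4d naive=%6d interned=%6d\n",
			n, naiveBytes(n, labels), internedBytes(n, labels))
	}
}
```

With a single series the symbol table plus refs cost more than just inlining the strings, matching the observation above that low-repetition requests get worse; with many series sharing strings, the interned size is dominated by the small fixed-size refs.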
e2e benchmark (old)

This benchmark was run using an old, early version of the PR. Keeping it for future reference.
Details
This benchmark runs two receivers and two senders, each sender/receiver pair running a different version: `sender_1.0 => receiver_1.0` and `sender_1.1 => receiver_1.1`. Both senders are scraping all the others and themselves, as well as an entire real Kubernetes namespace with more than 100 pods, including various services from Mimir.

TODO(@npazosmendez): share the util scripts to reproduce this benchmark with an arbitrary namespace.
Below are the values of some relevant counters after running everything for around 30 minutes:
CPU

`process_cpu_seconds_total`

- `sender-v1`: 109.16
- `sender-v11`: 109.13
- `receiver-v1`: 164.45
- `receiver-v11`: 127.49

Sender CPU is practically the same. The receiver using the new protocol uses ~22% less CPU.
Network

`prometheus_remote_storage_bytes_total`

- `sender-v1`: 321159262
- `sender-v11`: 243318645

The sender using the new protocol sent ~24% less data.
Memory

In-use heap size does not vary much (first screenshot is below). But the number of objects does change for the receivers, where the one running the new protocol is almost consistently lower (second screenshot), which should reduce stress on the GC.

The total bytes allocated is also lower:

`go_memstats_alloc_bytes_total`

- `sender-v1`: 7394260880
- `sender-v11`: 7374288920
- `receiver-v1`: 47286295416
- `receiver-v11`: 33116164600

Senders are practically the same. The receiver with the new protocol allocated ~30% fewer bytes.