Skip to content

Conversation

bwplotka
Copy link
Member

@bwplotka bwplotka commented Dec 30, 2024

Fixes #14668

Depends on #15966

Previous attempts #15726 #15729

Design

We have opportunity to stream protobuf unmarshalling. It's already sort of streamed by metric family, however no memory is reused across families. With this new implementation we:

  1. Have a single MetricSteamingDecoder that allows progressing per MetricFamily. It reuses previous MetricFamily object (e.g. internal bytes for lazy decoding of metrics).
  2. It also allows progressing per Metric, by unmarshalling MetricFamilies WITHOUT metrics array. We only save bytes position of each metric to unmarshal it later.
  3. When unmarshalling Metric we skip labels (we save position of each label in serialized proto for later). It offers separate Labels(*labels.ScratchBuilder) to parse and put labels directly into builder.
  4. All strings are yoloStrings, no copying. The only "copy" is for labels as we use them later in appending, so they can't be reused.

Maintenance

While it looks cumbersome to maintain, to make things efficient, one way or another we have to have streaming/lazy decoding with reusing of as much as possible. This does not depend on gogo we can do this on OpaqueAPI too but didn't want to change too much here. We could try to work with vtprotobuf or write our own plugin for those methods, but we can start by handcrafting now, this proto is not changing at all (if it will change in future, it will be OpenMetric proto likely).

@bwplotka

This comment was marked as outdated.

@bwplotka

This comment was marked as outdated.

@bwplotka bwplotka requested review from krajorama and bboreham January 3, 2025 10:46
Base automatically changed from scrapebench to main January 14, 2025 14:15
@bwplotka
Copy link
Member Author

Getting back to this after holidays...

@bwplotka bwplotka force-pushed the protoopt branch 2 times, most recently from 192d376 to 8333f3c Compare February 3, 2025 13:41
@bwplotka bwplotka marked this pull request as ready for review February 3, 2025 13:42
@bwplotka bwplotka force-pushed the protoopt branch 3 times, most recently from 91697a5 to 308d630 Compare February 3, 2025 14:47
@bwplotka
Copy link
Member Author

bwplotka commented Feb 3, 2025

This should be now good to go 🎉

cc @krajorama @beorn7 @bboreham

Results looks solid. There's room for further allocation improvements, but I propose we iterate. Some of those needs interface updates.

$ benchstat append-v1.txt append-v2.txt 
goos: darwin
goarch: arm64
pkg: github.com/prometheus/prometheus/scrape
cpu: Apple M1 Pro
                                                     │ append-v1.txt │           append-v2.txt            │
                                                     │    sec/op     │   sec/op     vs base               │
ScrapeLoopAppend/data=1Fam1000Gauges/fmt=PromText-2     280.4µ ± 10%   283.9µ ± 8%        ~ (p=0.937 n=6)
ScrapeLoopAppend/data=1Fam1000Gauges/fmt=OMText-2       265.1µ ± 10%   271.7µ ± 7%        ~ (p=0.818 n=6)
ScrapeLoopAppend/data=1Fam1000Gauges/fmt=PromProto-2    447.0µ ±  2%   299.8µ ± 7%  -32.93% (p=0.002 n=6)
ScrapeLoopAppend/data=59FamsAllTypes/fmt=PromText-2     168.9µ ±  3%   168.6µ ± 2%        ~ (p=0.937 n=6)
ScrapeLoopAppend/data=59FamsAllTypes/fmt=OMText-2       166.4µ ±  1%   167.5µ ± 2%   +0.68% (p=0.041 n=6)
ScrapeLoopAppend/data=59FamsAllTypes/fmt=PromProto-2    159.0µ ±  3%   146.1µ ± 2%   -8.07% (p=0.002 n=6)
geomean                                                 230.1µ         213.8µ        -7.10%

                                                     │ append-v1.txt │            append-v2.txt            │
                                                     │     B/op      │     B/op      vs base               │
ScrapeLoopAppend/data=1Fam1000Gauges/fmt=PromText-2    178.7Ki ±  8%   182.3Ki ± 6%        ~ (p=0.240 n=6)
ScrapeLoopAppend/data=1Fam1000Gauges/fmt=OMText-2      170.5Ki ± 10%   172.3Ki ± 2%        ~ (p=0.937 n=6)
ScrapeLoopAppend/data=1Fam1000Gauges/fmt=PromProto-2   665.7Ki ±  2%   212.5Ki ± 3%  -68.08% (p=0.002 n=6)
ScrapeLoopAppend/data=59FamsAllTypes/fmt=PromText-2    74.02Ki ±  1%   73.53Ki ± 3%        ~ (p=0.589 n=6)
ScrapeLoopAppend/data=59FamsAllTypes/fmt=OMText-2      72.87Ki ±  1%   73.53Ki ± 3%        ~ (p=0.485 n=6)
ScrapeLoopAppend/data=59FamsAllTypes/fmt=PromProto-2   169.2Ki ±  1%   110.6Ki ± 5%  -34.62% (p=0.002 n=6)
geomean                                                162.6Ki         125.9Ki       -22.56%

                                                     │ append-v1.txt │            append-v2.txt            │
                                                     │   allocs/op   │ allocs/op   vs base                 │
ScrapeLoopAppend/data=1Fam1000Gauges/fmt=PromText-2       11.00 ± 0%   11.00 ± 0%        ~ (p=1.000 n=6) ¹
ScrapeLoopAppend/data=1Fam1000Gauges/fmt=OMText-2         12.00 ± 0%   12.00 ± 0%        ~ (p=1.000 n=6) ¹
ScrapeLoopAppend/data=1Fam1000Gauges/fmt=PromProto-2    8012.00 ± 0%   21.00 ± 0%  -99.74% (p=0.002 n=6)
ScrapeLoopAppend/data=59FamsAllTypes/fmt=PromText-2       18.00 ± 0%   18.00 ± 0%        ~ (p=1.000 n=6) ¹
ScrapeLoopAppend/data=59FamsAllTypes/fmt=OMText-2         19.00 ± 0%   19.00 ± 0%        ~ (p=1.000 n=6) ¹
ScrapeLoopAppend/data=59FamsAllTypes/fmt=PromProto-2     1693.0 ± 0%   642.0 ± 0%  -62.08% (p=0.002 n=6)
geomean                                                   92.15        29.11       -68.41%
¹ all samples are equal

@bwplotka bwplotka requested a review from beorn7 February 3, 2025 14:48
@bwplotka
Copy link
Member Author

bwplotka commented Feb 4, 2025

I have some ideas to optimize further (see https://pprof.me/14c6981f2f1f0ab8dce893729df1da67/?profileType=profile%253Aalloc_objects%253Acount%253Aspace%253Abytes&color_by=filename), but I will do those in the next iteration - this PR is already big.

@bwplotka
Copy link
Member Author

bwplotka commented Feb 4, 2025

/prombench help

@prombot
Copy link
Contributor

prombot commented Feb 4, 2025

Available Commands:

  • To start benchmark: /prombench <branch or git tag to compare with>
  • To restart benchmark: /prombench <branch or git tag to compare with>
  • To stop benchmark: /prombench cancel
  • To print help: /prombench help

Advanced Flags for start and restart Commands:

  • --bench.directory=<sub-directory of github.com/prometheus/test-infra/prombench
    • See the details here, defaults to manifests/prombench.
  • --bench.version=<branch | @commit>
    • See the details here, defaults to master.

Examples:

  • /prombench v3.0.0
  • /prombench v3.0.0 --bench.version=@aca1803ccf5d795eee4b0848707eab26d05965cc --bench.directory=manifests/prombench

@bwplotka
Copy link
Member Author

bwplotka commented Feb 4, 2025

/prombench main --bench.version=bench/protofirst

@prombot
Copy link
Contributor

prombot commented Feb 4, 2025

⏱️ Welcome to Prometheus Benchmarking Tool. ⏱️

Compared versions: PR-15731 and main

Custom benchmark version: bench/protofirst branch

After the successful deployment (check status here), the benchmarking results can be viewed at:

Available Commands:

  • To restart benchmark: /prombench restart main --bench.version=bench/protofirst
  • To stop benchmark: /prombench cancel
  • To print help: /prombench help

@bwplotka
Copy link
Member Author

bwplotka commented Feb 4, 2025

/prombench cancel

@prombot
Copy link
Contributor

prombot commented Feb 4, 2025

Benchmark cancel is in progress.

@bwplotka
Copy link
Member Author

bwplotka commented Feb 5, 2025

/prombench cancel

@bwplotka bwplotka requested a review from dgl as a code owner February 12, 2025 08:22
@bwplotka
Copy link
Member Author

Rebased on top of #16012 otherwise good to go cc @bboreham - thanks for amazing review! 💪🏽

@bwplotka
Copy link
Member Author

/prombench main --bench.version=bench/cross-feature/protofirst

@prombot
Copy link
Contributor

prombot commented Feb 12, 2025

⏱️ Welcome to Prometheus Benchmarking Tool. ⏱️

Compared versions: PR-15731 and main

Custom benchmark version: bench/cross-feature/protofirst branch

After the successful deployment (check status here), the benchmarking results can be viewed at:

Available Commands:

  • To restart benchmark: /prombench restart main --bench.version=bench/cross-feature/protofirst
  • To stop benchmark: /prombench cancel
  • To print help: /prombench help

bwplotka added a commit that referenced this pull request Feb 12, 2025
Depends on #15731

Signed-off-by: bwplotka <bwplotka@gmail.com>
bwplotka added a commit that referenced this pull request Feb 12, 2025
Depends on #15731

Signed-off-by: bwplotka <bwplotka@gmail.com>
bwplotka added a commit that referenced this pull request Feb 12, 2025
Depends on #15731

Signed-off-by: bwplotka <bwplotka@gmail.com>
bwplotka added a commit that referenced this pull request Feb 12, 2025
Depends on #15731

Signed-off-by: bwplotka <bwplotka@gmail.com>
bwplotka added a commit that referenced this pull request Feb 12, 2025
Depends on #15731

Signed-off-by: bwplotka <bwplotka@gmail.com>
@bwplotka
Copy link
Member Author

Summary

The current state:

Microbenchmarks from #15731 (comment) are still valid.

Protobuf parsing is:

  • 15-40% faster (lower CPU time)
  • Allocates ~60% less memory.
  • Allocates 2x-700x less objects in memory.
  • It's the fastest (CPU time) scrape protocol now (~6% faster than PromText and OMText)
  • It's allocating ~30% memory than other protocols.

Macrobenchmarks from #15731 (comment) and #15731 (comment)

  • Slightly more allocs/s vs OpenMetrics Text
    image

  • Slightly higher CPU use vs OpenMetrics text (although significantly lower than proto parsing without this optimization on main). This is likely due to higher allocs/s (GC overhead).

image

  • RSS on par or lower:

image

Next steps

I propose we cache more in #16020 (more complex, this PR can be merged safely without #16020 )

@bwplotka
Copy link
Member Author

/prombench cancel

@prombot
Copy link
Contributor

prombot commented Feb 12, 2025

Benchmark cancel is in progress.

bwplotka added a commit that referenced this pull request Feb 12, 2025
…(...) (#16012)

* model/textparse: Change parser interface Metric(...) string to Labels(...)

Simplified the interface given no one is using the return argument.
Renamed for clarity too.

Found and discussed #15731 (comment)

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Fixed comments; optimized not needed copy for om and text.

Signed-off-by: bwplotka <bwplotka@gmail.com>

---------

Signed-off-by: bwplotka <bwplotka@gmail.com>
Signed-off-by: bwplotka <bwplotka@gmail.com>

Update model/textparse/protobufparse.go

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

Addressing comments.

Signed-off-by: bwplotka <bwplotka@gmail.com>

decoder: reuse histograms and summaries.

Signed-off-by: bwplotka <bwplotka@gmail.com>

optimize help returning (5% of mem utilization).

Signed-off-by: bwplotka <bwplotka@gmail.com>

Apply suggestions from code review

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

Update prompb/io/prometheus/client/decoder.go

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

Fix build.

Signed-off-by: bwplotka <bwplotka@gmail.com>
Copy link
Member

@bboreham bboreham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let’s give it a go.

kylestang pushed a commit to kylestang/prometheus that referenced this pull request Feb 12, 2025
…(...) (prometheus#16012)

* model/textparse: Change parser interface Metric(...) string to Labels(...)

Simplified the interface given no one is using the return argument.
Renamed for clarity too.

Found and discussed prometheus#15731 (comment)

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Fixed comments; optimized not needed copy for om and text.

Signed-off-by: bwplotka <bwplotka@gmail.com>

---------

Signed-off-by: bwplotka <bwplotka@gmail.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
@bwplotka
Copy link
Member Author

/prombench main --bench.version=bench/protofirst

@prombot
Copy link
Contributor

prombot commented Feb 13, 2025

⏱️ Welcome to Prometheus Benchmarking Tool. ⏱️

Compared versions: PR-15731 and main

Custom benchmark version: bench/protofirst branch

After the successful deployment (check status here), the benchmarking results can be viewed at:

Available Commands:

  • To restart benchmark: /prombench restart main --bench.version=bench/protofirst
  • To stop benchmark: /prombench cancel
  • To print help: /prombench help

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

scrape (histograms): Investigate (and address) protobuf scraping performance problems
6 participants