Skip to content

Conversation

bwplotka
Copy link
Member

Fixes #14218 and #14220

Rebased version of @ridwanmsharif #15254 with improvements.

This change does the following:

  • Change appender interface to be CT aware (optional CT)
  • Add created-timestamp-per-sample feature flag
  • Add new sample record used only if CT is appended with the sample.
  • Remote Write awareness of CT.

@bwplotka
Copy link
Member Author

/prombench main --bench.version=bench/cross-feature/ct-per-sample

@prombot
Copy link
Contributor

prombot commented Feb 17, 2025

⏱️ Welcome to Prometheus Benchmarking Tool. ⏱️

Compared versions: PR-16046 and main

Custom benchmark version: bench/cross-feature/ct-per-sample branch

After the successful deployment (check status here), the benchmarking results can be viewed at:

Available Commands:

  • To restart benchmark: /prombench restart main --bench.version=bench/cross-feature/ct-per-sample
  • To stop benchmark: /prombench cancel
  • To print help: /prombench help

@bwplotka bwplotka marked this pull request as ready for review February 17, 2025 15:01
@bwplotka
Copy link
Member Author

Functionality

Ok, it seems working for majority of incoming cumulatives. Not for native histograms so this is why there are still ones without CTs:

image

Efficiency

In terms of perf (for samples), we see some overhead:

A lot more allocs/s

image

Slight CPU increase

image

Up to 115ms vs 80ms latency spikes on query_range:

image

image

Slight increase in RSS:

image

Weird stuff:

I see odd notification/live long API call, not present on main...

image

pprof

@bwplotka
Copy link
Member Author

bwplotka commented Feb 17, 2025

Huh, so based on the profiles majority of allocated objects (other than querying) comes from protobuf parser still (magic labels) and CreatedTimestamp validation (and error?).

No error logs on Prometheus though.

image

@bwplotka
Copy link
Member Author

/prombench cancel

@prombot
Copy link
Contributor

prombot commented Feb 17, 2025

Benchmark cancel is in progress.

@bwplotka
Copy link
Member Author

No regression in terms of memory on query side. Majority of allocs comes from proto parsing still. We might want to change benchmark scenario to include proto parsing for both...

@bwplotka
Copy link
Member Author

Decoding is heavy though (from WAL watcher), which is 2x heavier for some reason:

image

vs main

image

... oh I think we have to update our calculation here:

image

@bwplotka
Copy link
Member Author

In terms of objects

image

@cstyan
Copy link
Member

cstyan commented Feb 25, 2025

RW package changes look fine 👍

Re: the allocations issue, you're suggesting that the min size needs to account for the additional integer?
dec.Len() / (1 + 1 + 1 + 8); cap(samples) < minSize?

@bwplotka
Copy link
Member Author

It feels so, yea. I want to repeat the experiment with both versions doing protobuf though for a fair experiment.

@bwplotka
Copy link
Member Author

Next steps:

  • Redo benchmark with both using protobuf parsing, double check interesting query timing results.
  • Review this PR, especially how I decided to form this flag.
  • Connect with people who are working on OTLP detlas in Prometheus, because I think they 1:1 use this feature cc @ArthurSens -- especially if we go this route with detlas, is there anything worth changing in the struct/WAL format while we are at it. For example, we could consider renaming "Created Timestamp" fields into something like "Start time"? although we pushed for created timestamp naming already and maybe it's fine.

Fixes #14218 and #14220

Rebased version of #15254 with improvements.

This change does the following:
- Change appender interface to be CT aware (optional CT)
- Add created-timestamp-per-sample feature flag
- Add new sample record used only if CT is appended with the sample.
- Remote Write awareness of CT.

Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>
Signed-off-by: bwplotka <bwplotka@gmail.com>

# Conflicts:
#	cmd/prometheus/main.go
#	scrape/helpers_test.go
#	storage/remote/write_handler_test.go
@bwplotka bwplotka force-pushed the ctwal branch 2 times, most recently from d16b65b to a736e37 Compare February 28, 2025 15:59
@bwplotka bwplotka force-pushed the ctwal branch 5 times, most recently from 2b310a0 to ddda640 Compare March 3, 2025 09:50
Signed-off-by: bwplotka <bwplotka@gmail.com>
@bwplotka
Copy link
Member Author

bwplotka commented Mar 3, 2025

Retrying benchmark after optimizations in watcher. Custom scenario have one update too -- proto parsing is now enabled on both versions of Prometheus.

/prombench restart main --bench.version=bench/cross-feature/ct-per-sample

@bwplotka
Copy link
Member Author

bwplotka commented Mar 3, 2025

/prombench start main --bench.version=bench/cross-feature/ct-per-sample

@prombot
Copy link
Contributor

prombot commented Mar 3, 2025

Incorrect /prombench syntax; command requires one argument that matches (master|main|v[0-9]+\.[0-9]+\.[0-9]+\S*) regex.

Available Commands:

  • To start benchmark: /prombench <branch or git tag to compare with>
  • To restart benchmark: /prombench <branch or git tag to compare with>
  • To stop benchmark: /prombench cancel
  • To print help: /prombench help

Advanced Flags for start and restart Commands:

  • --bench.directory=<sub-directory of github.com/prometheus/test-infra/prombench
    • See the details here, defaults to manifests/prombench.
  • --bench.version=<branch | @commit>
    • See the details here, defaults to master.

Examples:

  • /prombench v3.0.0
  • /prombench v3.0.0 --bench.version=@aca1803ccf5d795eee4b0848707eab26d05965cc --bench.directory=manifests/prombench

@bwplotka
Copy link
Member Author

bwplotka commented Mar 3, 2025

/prombench main --bench.version=bench/cross-feature/ct-per-sample

@prombot
Copy link
Contributor

prombot commented Mar 3, 2025

⏱️ Welcome to Prometheus Benchmarking Tool. ⏱️

Compared versions: PR-16046 and main

Custom benchmark version: bench/cross-feature/ct-per-sample branch

After the successful deployment (check status here), the benchmarking results can be viewed at:

Available Commands:

  • To restart benchmark: /prombench restart main --bench.version=bench/cross-feature/ct-per-sample
  • To stop benchmark: /prombench cancel
  • To print help: /prombench help

@bwplotka
Copy link
Member Author

bwplotka commented Mar 3, 2025

Thanks to decoder/watcher fixes I see significant improvement in RSS and allocs/s vs main WITHOUT CTs in WAL 😱

image

CPU etc looks on par

Query engine timing is worse - not sure why:

image

CPU wise it looks like GC is more intensive, querying is actually using more CPU on main 🤔 https://pprof.me/33106dee65aec562860070b28ed2d895

Looks like simply more objects are allocated https://pprof.me/c23cf4fda6e7be3ca34d292a30fc5f13
Total memory is seems better on this PR: https://pprof.me/0c232207d6fd0f7e461054bb972ebbb0

@bwplotka
Copy link
Member Author

bwplotka commented Mar 7, 2025

/prombench cancel

@prombot
Copy link
Contributor

prombot commented Mar 7, 2025

Benchmark cancel is in progress.

@bwplotka bwplotka marked this pull request as draft March 7, 2025 12:52
@bwplotka
Copy link
Member Author

bwplotka commented Mar 7, 2025

Adding a few PRs first to split this change into smaller parts.

@bwplotka
Copy link
Member Author

bwplotka commented Mar 7, 2025

e.g. #16072, #16156, #16182

@github-actions github-actions bot added the stale label May 7, 2025
@bwplotka
Copy link
Member Author

bwplotka commented Jul 2, 2025

TODO:

The main non-trivial decision is about that WAL format. In this PR I propose CT next to every sample value and timestamp (all is diff-encoded) and it was showing not that big overhead, but we have to convince others.

The alternative is to perhaps design something similar to native histograms records, so create a record per different CT.

cc @ridwanmsharif

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Created Timestamp: Store CT in WAL
4 participants