Skip to content

Conversation

thampiotr
Copy link
Contributor

@thampiotr thampiotr commented Jul 29, 2025

The issue:

  • The buffers used in Protobuf parser are reused in scrape.go.
  • But we return an unsafe string for labels key and value in decoder.go
  • The labels are not copied and their underlying buffer gets reused, leading to data corruption.

Proposed fix: Use UnsafeAddBytes in builder to handle the unsafe string correctly in each labels builder implementation.

NOTE: Since bb76966 the stringlabels are default. This issue affects only slicelabels. Added the test to GH package.

To reproduce locally, run with slicelabels tag:

go test --tags=slicelabels ./prompb/io/prometheus/client/...
--- FAIL: TestMetricStreamingDecoder_LabelsCorruption (0.00s)
    decoder_test.go:212: 
                Error Trace:    /Users/thampiotr/workspace/prometheus/prompb/io/prometheus/client/decoder_test.go:212
                Error:          Should be true
                Test:           TestMetricStreamingDecoder_LabelsCorruption
                Messages:       encountered corrupted labels: {"\finstance_35"="\tvalue_85", "\finstance_47"="\tvalue_50"}
FAIL
FAIL    github.com/prometheus/prometheus/prompb/io/prometheus/client    0.231s
FAIL

@thampiotr thampiotr force-pushed the fix-memory-corruption-parser branch 2 times, most recently from d5659d0 to 36f05e4 Compare July 29, 2025 16:11
Signed-off-by: Piotr <17101802+thampiotr@users.noreply.github.com>
@thampiotr thampiotr force-pushed the fix-memory-corruption-parser branch 2 times, most recently from 9bc4fbf to 6c5ef02 Compare July 29, 2025 16:51
@bwplotka
Copy link
Member

bwplotka commented Jul 29, 2025

Nice catch! Your intuition makes sense, thanks!

Which one is the CHOSEN/better one in general slicelabels or stringlabels? I don't remember at this point @bboreham

If the stringlabels then simply copy for slicelabels here (only for that tag) would be preferable.

@bboreham
Copy link
Member

Which one is the CHOSEN/better one in general slicelabels or stringlabels?

Prometheus has built with "stringlabels" since v2.44 more than two years ago; it saves memory and runs faster. Since #16436 (three months ago) the code has used that mode by default.

For people using Prometheus code as a library, it depends how what kind of label strings they have. If the average name/value length is more than 15 bytes, and they don't have a lot of copies of the same string, then slicelabels will be smaller.

(There is also dedupelabels which makes "more intelligent" choices, but in practice it trades less memory for more cpu and that wasn't a worthwhile trade for my employers and I stopped working on it)

If the stringlabels then simply copy for slicelabels here (only for that tag) would be preferable.

That is what the UnsafeAddBytes method achieves, without burdening parsing code with build tags.

Signed-off-by: Piotr <17101802+thampiotr@users.noreply.github.com>
@thampiotr thampiotr force-pushed the fix-memory-corruption-parser branch from 6c5ef02 to be46837 Compare July 30, 2025 10:23
@thampiotr
Copy link
Contributor Author

thampiotr commented Jul 30, 2025

That is what the UnsafeAddBytes method achieves, without burdening parsing code with build tags.

I like this approach as it allows the implementation of *labels build tag to decide what's the best way to handle it. But I'm open to alternatives if you decide.

Just added the last comment that @bboreham recommended and so it's ready for review.

Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, let's go, thanks!!!

TIL UnsafeAddBytes, like the approach @bboreham 👍🏽

@@ -40,7 +40,7 @@ jobs:
- uses: prometheus/promci@443c7fc2397e946bc9f5029e313a9c3441b9b86d # v0.4.7
- uses: ./.github/promci/actions/setup_environment
- run: go test --tags=dedupelabels ./...
- run: go test --tags=slicelabels -race ./cmd/prometheus
- run: go test --tags=slicelabels -race ./cmd/prometheus ./prompb/io/prometheus/client
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we don't do ./... here, not sure.. but I guess let's keep it this way

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't remember. Seems reasonable to swap with dedupelabels which I don't know anyone is using.

@bwplotka bwplotka merged commit e35c09d into prometheus:main Jul 30, 2025
27 checks passed
@thampiotr thampiotr deleted the fix-memory-corruption-parser branch July 31, 2025 16:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants