scrape: Add config option for escaping scheme request. #16066

ywwg · 2025-02-21T18:27:23Z

The new metric_name_escaping_scheme config option works in parallel with metric_name_validation_scheme and controls which escaping scheme is requested when scraping. When not specified, the scheme will request underscores if the validation scheme is set to legacy, and will request allow-utf-8 when the validation scheme is set to utf8. This setting allows users to allow utf8 names if they like, but explicitly request an escaping scheme rather than UTF-8.

Fixes #16034

Built on #16080

scrape/scrape.go

bwplotka

Thanks! Some questions and thoughts to progress here. It makes sense to add override for escape, but not sure of the choice of defaults here. Again, for the record, I am not sure anyone would like their metrics to be automatically changed, any time, especially dynamically, in a hidden way on SDK level. Some expert eyes on this like @fstab would be amazing.

I noticed Prometheus now always have model.NameValidationScheme set to UTF-8 AFAIK; why we keep validating this?

As per other changes -- before this change (and in 3.x?) we used to have:

config MetricNameValidationScheme is empty by default
If config MetricNameValidationScheme == legacy:
- We do legacy validation
- We don't send the header (we did for PromText 1.0 accidently)
Else
- We do UTF-8 validation
- We send escape=AllowUTF-8 on all scrape protocols

Now:

Since model.NameValidationScheme is always UTF-8, the default means UTF-8 for config MetricNameValidationScheme, same as now too.
if config MetricNameValidationScheme is set to legacy we do underscore escaping (breaking, we used to do no escaping header).
else no escaping so `escaping="Allow-UTF" as it was

Then users can override escaping..

Is that right?

On a quick glance, should we:

Kill all validation or mention,defaulting to different model.NameValidationScheme. It should be always UTF-8 now? We can set explicit global in main.go too?

Then, it's convoluted but:

Do we have to break API and do underscore escaping by default on legacy validation? Can we just fail? (it's what we do now?)
We add AllowUTF-8 escaping for UTF-8 validation (so always, by default). Why? What it changes for SDKs actually? (ignorant question likely, sorry, trying to understand).

ywwg · 2025-02-24T14:55:27Z

if config MetricNameValidationScheme is set to legacy we do underscore escaping (breaking, we used to do no escaping header).

this is not technically breaking since sending no escaping header asks the common library to apply its default escaping method, which is hard-coded to underscores right now. Hypothetically someone could override that global in common, but I do not think anyone has ever done that.

ywwg · 2025-02-24T14:57:47Z

Kill all validation or mention,defaulting to different model.NameValidationScheme. It should be always UTF-8 now? We can set explicit global in main.go too?

I think there's a good argument for this. Even with the default ValidationScheme set to UTF-8, it is possible to configure a datasource for legacy validation.

ywwg · 2025-02-24T15:08:16Z

We add AllowUTF-8 escaping for UTF-8 validation (so always, by default). Why? What it changes for SDKs actually? (ignorant question likely, sorry, trying to understand).

I'm not quite sure I understand the question. "allow-utf-8" is a content negotiation header, which is part of Prometheus itself, not part of the SDK. The Common SDK only provides a way to instantiate metrics and a single bare function to validate that their names are correct. Because the SDK is very plain, it has no state to know whether a set of metrics are UTF-8 or Legacy. That's why we had to create the global variable, so that the existing ecosystem of clients would not have to change every validation callsite. But there is a new API in the SDK for asking if a metric is specifically Legacy-valid, and that's the API Prometheus uses when an individual datasource is set to legacy mode. So Prometheus itself does not need the common library global variable, because it has been updated to properly support UTF-8. Other code that uses the common SDK could be similarly updated, but until that happens, if we get rid of the global than other code may break because it is not UTF-8 aware.

bwplotka · 2025-02-24T17:17:51Z

I think there's a good argument for this. Even with the default ValidationScheme set to UTF-8, it is possible to configure a datasource for legacy validation

It is possible, but that global is always set to utf-8 so we can kill all the code that depends on model.NameValidationScheme and simplify all logic, no?

If someone specifies scraping as legacy all will work correctly, we still don't change this global.

bwplotka · 2025-02-24T17:40:35Z

I'm not quite sure I understand the question. "allow-utf-8" is a content negotiation header, which is part of Prometheus itself, not part of the SDK. The Common SDK only provides a way to instantiate metrics and a single bare function to validate that their names are correct. Because the SDK is very plain, it has no state to know whether a set of metrics are UTF-8 or Legacy. That's why we had to create the global variable, so that the existing ecosystem of clients would not have to change every validation callsite. But there is a new API in the SDK for asking if a metric is specifically Legacy-valid, and that's the API Prometheus uses when an individual datasource is set to legacy mode. So Prometheus itself does not need the common library global variable, because it has been updated to properly support UTF-8. Other code that uses the common SDK could be similarly updated, but until that happens, if we get rid of the global than other code may break because it is not UTF-8 aware.

I think we misunderstood each other here, around the global. I meant to remove the use of global in Prometheus only.

As per my main question I found the answer in the new code I was not aware of in expfmt: https://github.com/prometheus/common/blob/main/expfmt%2Ftext_create.go#L454

And that expfmt dynamicity with names scares me, I think we should reject if we cannot accept utf vs dynamic encoding and it's sad we maintain client_golang and this change slipped through the common update, without our explicit approval. This shouldn't be added in the first place IMO. We are adding too much complexity for too niche use cases

To sum up, let's do this encoding option asap for people to be able to unblock themselves (just let's please simplify the validation and ifs) 🙈 - I think you are right, not breaking change then, but this situation is not ideal.

What would take to have NO encoding meaning "allow-utf-8" semantics (no name escaping)? I guess we wait for 2.0 to allow this?

ywwg · 2025-02-25T15:54:19Z

I think we misunderstood each other here, around the global. I meant to remove the use of global in Prometheus only.

Ah, got it. Yes I think we can remove it in prom. It's not in too many places so that shouldn't be hard.

As per my main question I found the answer in the new code I was not aware of in expfmt: prometheus/common@main/expfmt%2Ftext_create.go#L454

And that expfmt dynamicity with names scares me, I think we should reject if we cannot accept utf vs dynamic encoding and it's sad we maintain client_golang and this change slipped through the common update, without our explicit approval. This shouldn't be added in the first place IMO. We are adding too much complexity for too niche use cases

To sum up, let's do this encoding option asap for people to be able to unblock themselves (just let's please simplify the validation and ifs) 🙈 - I think you are right, not breaking change then, but this situation is not ideal.

Mostly the if statements are to cover all of the possible permutations of settings... We could start by removing switching based on model.NameValidationScheme and simply panic if it has been set to Legacy. That would make the escaping request code simpler.

What would take to have NO encoding meaning "allow-utf-8" semantics (no name escaping)? I guess we wait for 2.0 to allow this?

It depends if we care about preserving backward compatibility. Do we want to suddenly start serving UTF-8 to old versions of prometheus running on some raspberry pi that never gets updated? Suddenly scrapes would be in the new exposition format and the old code would reject the scrape, and there would be no way to configure the old prometheus to request an escaped format. My goal was to make sure old code would get the old format. (I suppose the user could set up legacy mode on the other end in this instance)

bwplotka · 2025-02-25T17:44:40Z

It depends if we care about preserving backward compatibility. Do we want to suddenly start serving UTF-8 to old versions of prometheus running on some raspberry pi that never gets updated? Suddenly scrapes would be in the new exposition format and the old code would reject the scrape, and there would be no way to configure the old prometheus to request an escaped format.

Got it, thanks. I have bunch of counter arguments, but there's no point, the decisions were made. Hopefully with OM 2.0 we can have none of this escaping magic. Perhaps it's expected that with OM 1.0 with UTF-8 restrictions you have to poke it with extra escaping header piece to trigger non escaping. Thanks, LGTM but I would suggest simplifying the logic with hardcoded support of UTF-8 in Prometheus (assumption that global is not changing).

ywwg · 2025-02-25T18:09:37Z

none of this escaping magic

long term, I do think the correct approach is to have the option on the metrics-producer side. The scraper should be able to accept anything. We can try to push things in that direction.

ywwg · 2025-03-12T18:34:34Z

rebased against the cleanup PR

ywwg · 2025-03-12T19:13:07Z

not sure what is wrong with the windows tests

Signed-off-by: Owen Williams <owen.williams@grafana.com>

bwplotka

Thank you! Much better! Some suggestions inlined, but tl;dr:

We can simplify even more and move all validation and defaulting to a single place e.g. config layer.
We also need docs.
There might be validation gap/bug.

Otherwise looks solid!

cmd/prometheus/main.go

config/config.go

scrape/scrape.go

Signed-off-by: Owen Williams <owen.williams@grafana.com>

ywwg · 2025-03-18T21:18:18Z

as mentioned in chat, I don't love turning the schemes into required arguments in the config object. But now that the config is used to create the loop/pool, we could move the default-handling back to scrape.go and put it in newscrapepool/reload.

scrape/scrape.go

scrape/scrape_test.go

scrape/scrape.go

bwplotka

Thanks! Much better, only two nits, LGTM!

…s/escapeconfig

Signed-off-by: Owen Williams <owen.williams@grafana.com>

This is a fix for documentation update done in prometheus#16066, setting correct name for a configuration value. Signed-off-by: Martin Danko <46035688+zepellin@users.noreply.github.com>

ywwg force-pushed the owilliams/escapeconfig branch from 2420a0b to 73ae88d Compare February 21, 2025 18:27

bwplotka reviewed Feb 24, 2025

View reviewed changes

scrape/scrape.go Outdated Show resolved Hide resolved

bwplotka reviewed Feb 24, 2025

View reviewed changes

scrape/scrape.go Outdated Show resolved Hide resolved

bwplotka reviewed Feb 24, 2025

View reviewed changes

scrape/scrape.go Outdated Show resolved Hide resolved

bwplotka reviewed Feb 24, 2025

View reviewed changes

ywwg mentioned this pull request Mar 3, 2025

feat: Create way to configure explicit escaping mode other than UTF-8 or "default" #16034

Closed

3 tasks

ywwg changed the base branch from main to owilliams/escapeconfig-00-cleanup March 12, 2025 18:32

ywwg marked this pull request as ready for review March 12, 2025 18:34

ywwg force-pushed the owilliams/escapeconfig branch from d45a32f to 7274a97 Compare March 12, 2025 18:43

ywwg force-pushed the owilliams/escapeconfig-00-cleanup branch from db82701 to 94b43c5 Compare March 13, 2025 14:48

ywwg requested review from dgl, roidelapluie, cstyan and tomwilkie as code owners March 13, 2025 14:48

Base automatically changed from owilliams/escapeconfig-00-cleanup to main March 13, 2025 18:21

ywwg force-pushed the owilliams/escapeconfig branch from 7274a97 to 6956644 Compare March 13, 2025 18:33

scrape: Add config option for escaping scheme request

bb79f20

Signed-off-by: Owen Williams <owen.williams@grafana.com>

ywwg force-pushed the owilliams/escapeconfig branch from 6956644 to bb79f20 Compare March 13, 2025 18:34

ywwg added area/utf8 Issues pertaining to UTF-8 support component/scraping labels Mar 13, 2025

ywwg requested a review from bwplotka March 14, 2025 13:41

bwplotka requested changes Mar 18, 2025

View reviewed changes

ywwg added 5 commits March 18, 2025 11:03

notes

79d5afd

Signed-off-by: Owen Williams <owen.williams@grafana.com>

doc

3c591be

Signed-off-by: Owen Williams <owen.williams@grafana.com>

more doc

ffbab9b

Signed-off-by: Owen Williams <owen.williams@grafana.com>

move some things around

2c09c18

Signed-off-by: Owen Williams <owen.williams@grafana.com>

Merge remote-tracking branch 'origin/main' into owilliams/escapeconfig

aa1abc4

ywwg requested a review from bwplotka March 19, 2025 13:05

ywwg commented Mar 19, 2025

View reviewed changes

scrape/scrape.go Show resolved Hide resolved

ywwg commented Mar 19, 2025

View reviewed changes

scrape/scrape_test.go Show resolved Hide resolved

ArthurSens mentioned this pull request Mar 26, 2025

Changed behaviour of resource generation in Prometheus receiver since upgrade to Prometheus 3.X open-telemetry/opentelemetry-collector-contrib#38097

Closed

bwplotka reviewed Mar 26, 2025

View reviewed changes

scrape/scrape.go Outdated Show resolved Hide resolved

bwplotka reviewed Mar 26, 2025

View reviewed changes

scrape/scrape.go Outdated Show resolved Hide resolved

bwplotka approved these changes Mar 26, 2025

View reviewed changes

ywwg added 4 commits March 26, 2025 15:29

Merge branch 'main' of github.com:prometheus/prometheus into owilliam…

20a3e09

…s/escapeconfig

notes and nits

29d6602

Signed-off-by: Owen Williams <owen.williams@grafana.com>

lint

53715a3

Signed-off-by: Owen Williams <owen.williams@grafana.com>

nit

1a40288

Signed-off-by: Owen Williams <owen.williams@grafana.com>

ywwg merged commit 6566c5a into main Mar 26, 2025
44 of 45 checks passed

ywwg deleted the owilliams/escapeconfig branch March 26, 2025 22:27

charleskorn mentioned this pull request Mar 30, 2025

Sync upstream at 74a1d96 grafana/mimir-prometheus#862

Merged

zepellin mentioned this pull request May 7, 2025

docs: Fix metric_name_escaping_scheme config parameter #16563

Merged

scrape: Add config option for escaping scheme request. #16066

scrape: Add config option for escaping scheme request. #16066

Uh oh!

Conversation

ywwg commented Feb 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

bwplotka left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ywwg commented Feb 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ywwg commented Feb 24, 2025

Uh oh!

ywwg commented Feb 24, 2025

Uh oh!

bwplotka commented Feb 24, 2025

Uh oh!

bwplotka commented Feb 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ywwg commented Feb 25, 2025

Uh oh!

bwplotka commented Feb 25, 2025

Uh oh!

ywwg commented Feb 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ywwg commented Mar 12, 2025

Uh oh!

ywwg commented Mar 12, 2025

Uh oh!

bwplotka left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ywwg commented Mar 18, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

bwplotka left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

ywwg commented Feb 21, 2025 •

edited

Loading

bwplotka left a comment •

edited

Loading

ywwg commented Feb 24, 2025 •

edited

Loading

bwplotka commented Feb 24, 2025 •

edited

Loading

ywwg commented Feb 25, 2025 •

edited

Loading

bwplotka left a comment •

edited

Loading