-
Notifications
You must be signed in to change notification settings - Fork 9.7k
scrape: Add config option for escaping scheme request. #16066
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
2420a0b
to
73ae88d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Some questions and thoughts to progress here. It makes sense to add override for escape, but not sure of the choice of defaults here. Again, for the record, I am not sure anyone would like their metrics to be automatically changed, any time, especially dynamically, in a hidden way on SDK level. Some expert eyes on this like @fstab would be amazing.
I noticed Prometheus now always have model.NameValidationScheme
set to UTF-8 AFAIK; why we keep validating this?
As per other changes -- before this change (and in 3.x?) we used to have:
- config
MetricNameValidationScheme
is empty by default - If config
MetricNameValidationScheme
== legacy:- We do legacy validation
- We don't send the header (we did for PromText 1.0 accidently)
- Else
- We do UTF-8 validation
- We send escape=AllowUTF-8 on all scrape protocols
Now:
- Since
model.NameValidationScheme
is always UTF-8, the default means UTF-8 for configMetricNameValidationScheme
, same as now too. - if config
MetricNameValidationScheme
is set to legacy we do underscore escaping (breaking, we used to do no escaping header). - else no escaping so `escaping="Allow-UTF" as it was
Then users can override escaping..
Is that right?
On a quick glance, should we:
- Kill all validation or mention,defaulting to different
model.NameValidationScheme
. It should be always UTF-8 now? We can set explicit global in main.go too?
Then, it's convoluted but:
- Do we have to break API and do underscore escaping by default on legacy validation? Can we just fail? (it's what we do now?)
- We add
AllowUTF-8
escaping for UTF-8 validation (so always, by default). Why? What it changes for SDKs actually? (ignorant question likely, sorry, trying to understand).
this is not technically breaking since sending no escaping header asks the common library to apply its default escaping method, which is hard-coded to underscores right now. Hypothetically someone could override that global in common, but I do not think anyone has ever done that. |
I think there's a good argument for this. Even with the default ValidationScheme set to UTF-8, it is possible to configure a datasource for legacy validation. |
I'm not quite sure I understand the question. "allow-utf-8" is a content negotiation header, which is part of Prometheus itself, not part of the SDK. The Common SDK only provides a way to instantiate metrics and a single bare function to validate that their names are correct. Because the SDK is very plain, it has no state to know whether a set of metrics are UTF-8 or Legacy. That's why we had to create the global variable, so that the existing ecosystem of clients would not have to change every validation callsite. But there is a new API in the SDK for asking if a metric is specifically Legacy-valid, and that's the API Prometheus uses when an individual datasource is set to legacy mode. So Prometheus itself does not need the common library global variable, because it has been updated to properly support UTF-8. Other code that uses the common SDK could be similarly updated, but until that happens, if we get rid of the global than other code may break because it is not UTF-8 aware. |
It is possible, but that global is always set to utf-8 so we can kill all the code that depends on model.NameValidationScheme and simplify all logic, no? If someone specifies scraping as legacy all will work correctly, we still don't change this global. |
I think we misunderstood each other here, around the global. I meant to remove the use of global in Prometheus only. As per my main question I found the answer in the new code I was not aware of in expfmt: https://github.com/prometheus/common/blob/main/expfmt%2Ftext_create.go#L454 And that expfmt dynamicity with names scares me, I think we should reject if we cannot accept utf vs dynamic encoding and it's sad we maintain client_golang and this change slipped through the common update, without our explicit approval. This shouldn't be added in the first place IMO. We are adding too much complexity for too niche use cases To sum up, let's do this encoding option asap for people to be able to unblock themselves (just let's please simplify the validation and ifs) 🙈 - I think you are right, not breaking change then, but this situation is not ideal. What would take to have NO encoding meaning "allow-utf-8" semantics (no name escaping)? I guess we wait for 2.0 to allow this? |
Ah, got it. Yes I think we can remove it in prom. It's not in too many places so that shouldn't be hard.
Mostly the if statements are to cover all of the possible permutations of settings... We could start by removing switching based on model.NameValidationScheme and simply panic if it has been set to Legacy. That would make the escaping request code simpler.
It depends if we care about preserving backward compatibility. Do we want to suddenly start serving UTF-8 to old versions of prometheus running on some raspberry pi that never gets updated? Suddenly scrapes would be in the new exposition format and the old code would reject the scrape, and there would be no way to configure the old prometheus to request an escaped format. My goal was to make sure old code would get the old format. (I suppose the user could set up legacy mode on the other end in this instance) |
Got it, thanks. I have bunch of counter arguments, but there's no point, the decisions were made. Hopefully with OM 2.0 we can have none of this escaping magic. Perhaps it's expected that with OM 1.0 with UTF-8 restrictions you have to poke it with extra |
long term, I do think the correct approach is to have the option on the metrics-producer side. The scraper should be able to accept anything. We can try to push things in that direction. |
rebased against the cleanup PR |
d45a32f
to
7274a97
Compare
not sure what is wrong with the windows tests |
db82701
to
94b43c5
Compare
7274a97
to
6956644
Compare
Signed-off-by: Owen Williams <owen.williams@grafana.com>
6956644
to
bb79f20
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Much better! Some suggestions inlined, but tl;dr:
- We can simplify even more and move all validation and defaulting to a single place e.g. config layer.
- We also need docs.
- There might be validation gap/bug.
Otherwise looks solid!
Signed-off-by: Owen Williams <owen.williams@grafana.com>
as mentioned in chat, I don't love turning the schemes into required arguments in the config object. But now that the config is used to create the loop/pool, we could move the default-handling back to scrape.go and put it in newscrapepool/reload. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Much better, only two nits, LGTM!
Signed-off-by: Owen Williams <owen.williams@grafana.com>
This is a fix for documentation update done in prometheus#16066, setting correct name for a configuration value. Signed-off-by: Martin Danko <46035688+zepellin@users.noreply.github.com>
The new metric_name_escaping_scheme config option works in parallel with metric_name_validation_scheme and controls which escaping scheme is requested when scraping. When not specified, the scheme will request underscores if the validation scheme is set to legacy, and will request allow-utf-8 when the validation scheme is set to utf8. This setting allows users to allow utf8 names if they like, but explicitly request an escaping scheme rather than UTF-8.
Fixes #16034
Built on #16080