Skip to content

Conversation

ywwg
Copy link
Member

@ywwg ywwg commented Aug 21, 2024

This change causes Prometheus to allow all UTF-8 characters in metric and label names. This means that names that were previously invalid and would have been previously rejected will be allowed through.

Updates a number of tests to make sure we are still testing legacy behavior where needed.

part of #13095
fixes #14674

@ywwg ywwg force-pushed the owilliams/default-on branch 2 times, most recently from 5d0dcde to 91f20f9 Compare August 21, 2024 14:36
@ywwg ywwg changed the base branch from main to release-3.0 August 21, 2024 14:37
@ywwg ywwg marked this pull request as draft August 21, 2024 14:37
@ywwg ywwg force-pushed the owilliams/default-on branch from 91f20f9 to d640d3f Compare August 21, 2024 14:38
@ywwg ywwg marked this pull request as ready for review August 21, 2024 14:39
@ywwg ywwg marked this pull request as draft August 21, 2024 15:11
@ywwg
Copy link
Member Author

ywwg commented Aug 21, 2024

need to fix some tests

@ywwg ywwg marked this pull request as ready for review August 23, 2024 15:05
@ywwg ywwg requested a review from dgl as a code owner August 23, 2024 15:05
Copy link
Member

@codesome codesome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in general, but I am not sure about having legacy-names in enable-feature. Left a comment about it below.

case "utf8-names":
model.NameValidationScheme = model.UTF8Validation
level.Info(logger).Log("msg", "Experimental UTF-8 support enabled")
case "legacy-names":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean 2.x has --enable-feature=utf8-names and 3.x will have --enable-feature=legacy-names with utf8 on by default?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2.x will have a feature called utf8-names available but it will not be set by default. 3.x will not have a feature flag for it, utf8 will be default-on and it can be disabled through the global or scrapeconfig flags

@@ -461,7 +463,7 @@ func main() {
a.Flag("scrape.discovery-reload-interval", "Interval used by scrape manager to throttle target groups updates.").
Hidden().Default("5s").SetValue(&cfg.scrape.DiscoveryReloadInterval)

a.Flag("enable-feature", "Comma separated feature names to enable. Valid options: agent, auto-gomemlimit, exemplar-storage, expand-external-labels, memory-snapshot-on-shutdown, promql-per-step-stats, promql-experimental-functions, extra-scrape-metrics, new-service-discovery-manager, auto-gomaxprocs, no-default-scrape-port, native-histograms, otlp-write-receiver, created-timestamp-zero-ingestion, concurrent-rule-eval. See https://prometheus.io/docs/prometheus/latest/feature_flags/ for more details.").
a.Flag("enable-feature", "Comma separated feature names to enable. Valid options: agent, auto-gomemlimit, exemplar-storage, expand-external-labels, memory-snapshot-on-shutdown, promql-per-step-stats, promql-experimental-functions, extra-scrape-metrics, new-service-discovery-manager, auto-gomaxprocs, no-default-scrape-port, native-histograms, otlp-write-receiver, created-timestamp-zero-ingestion, concurrent-rule-eval, legacy-names. See https://prometheus.io/docs/prometheus/latest/feature_flags/ for more details.").
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enable-feature is for experimental features. legacy-names is neither breaking nor experimental. I think this should be a flag of its own or something else. Another argument can be made that in the future we want to always have utf8 enabled in Prometheus and remove the legacy-names, in which case feature flags are better. @beorn7 WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the below comment; we figured we don't need the flag afterall

Copy link
Member

@codesome codesome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some offline chat with @ywwg we figured we don't need the flag at all. The config file has two levels to manage the utf8 - global and at scrape level. Global config already controls the utf8 for all the scrapes, so the flag is redundant. utf8 will be on by default, and users can simply set the global config to legacy when migrating to 3.0.

We need tests to check the scrape level config overriding global config when set.

@ywwg ywwg force-pushed the owilliams/default-on branch from cbfc63a to fd6c2ee Compare August 28, 2024 19:20
@ywwg ywwg requested a review from codesome August 29, 2024 13:28
config/config.go Outdated
if model.NameValidationScheme != model.UTF8Validation {
return fmt.Errorf("utf8 name validation requested but feature not enabled via --enable-feature=utf8-names")
return fmt.Errorf("utf8 name validation requested but validation mode is not set to UTF8")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not what we want, is it?

We want the user to be able to specify legacy validation globally, but then this error would fire.

I think we only need to check the case that an invalid validation scheme is provided.

This has implications for the tests below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see @codesome 's comment: #14705 (review)

Legacy mode should only be set in the global config or scrape config, while the common library itself always needs to be set to validate UTF-8 by default or those settings won't work. This conditional is a check that there is not a programming error.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Of course, you are not comparing to the config setting here.

In that case, let's just panic. That's fine (or actually most appropriate) for a programming error. Returning an error from the config validation certainly suggests there is an error in the config (not in the code).

@ywwg ywwg force-pushed the owilliams/default-on branch from 6e4311d to 7b408ad Compare September 5, 2024 19:11
@ywwg ywwg requested a review from beorn7 September 5, 2024 19:19
This change causes Prometheus to allow all UTF-8 characters in metric and label names.
This means that names that were previously invalid and would have been previously rejected will be allowed through.

Signed-off-by: Owen Williams <owen.williams@grafana.com>
@ywwg ywwg force-pushed the owilliams/default-on branch from 7b408ad to 88bb05c Compare September 6, 2024 12:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants