scrape: make `le` and `quantile` normalization configurable #12984

### Proposal

This is a sister proposal to #12934. It addresses the same problem in a different way. The two are not necessarily mutually exclusive, but implementing one might diminish the need for the other.

## Problem

The `le` label of a classic histogram and the `quantile` label of a summary hold values that represent float numbers, although they are technically strings, like all label values. While PromQL sometimes parses them back as floats (namely in the `histogram_quantile` function), it generally treats them as strings and matches them as strings. This creates problems because one and the same float number can be represented by many different strings (see #12934 for more details). Here are the practically relevant issues:

- In Prom 1.x, all these label values were normalized (whether they came in via text format or protobuf format), using the usual Go formatting (`%g`). While I assume this can still yield unstable results in specific cases with many decimal places, this generally worked fine.
- Prom 2.x stopped normalizing these label values, and also dropped protobuf scraping. All these label values were taken "as is", i.e. as exposed by the instrumented target, thereby effectively importing the formatting convention of the language the target was written in. This created confusion when moving from Prom 1.x to Prom 2.x because histograms and summaries coming from Java or Python suddenly showed up differently in Prometheus.
- OpenMetrics tried to unify the formatting. While not ruling out all ambiguities for floats with many decimal places (see above), it mostly settled on a Python-like formatting. This created confusion for Go binaries when switching them to OM.
- Prom 2.40 brought us native histograms and, as a byproduct, resurrected protobuf scraping. Protobuf _has_ to normalize because the `le` and `quantile` values are _actual float numbers_ in the format, rather than strings. To stay consistent with OM, the new protobuf parser uses OM-style formatting. This creates confusion when you switch to native histograms while running Go binaries that do not yet use the OM mode of prometheus/client_golang (which is a huge number, e.g. all of K8s, Prometheus itself, and many other binaries from the Prometheus ecosystem). The intent was to scrape classic histograms as before, but now all the `le` labels are OM-formatted.
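
To illustrate the underlying mismatch, here is a minimal Go sketch. It is not Prometheus code, and `sameFloat` is a hypothetical helper name. It only shows that several spellings of a bucket boundary parse to the same float while failing a plain string comparison, which is roughly what a label matcher does.

```go
package main

import (
	"fmt"
	"strconv"
)

// sameFloat reports whether two label values parse to the same float64,
// regardless of how they are spelled. (Hypothetical helper for illustration.)
func sameFloat(a, b string) bool {
	fa, errA := strconv.ParseFloat(a, 64)
	fb, errB := strconv.ParseFloat(b, 64)
	return errA == nil && errB == nil && fa == fb
}

func main() {
	// Three spellings of the same bucket boundary.
	for _, le := range []string{"1", "1.0", "1e+00"} {
		fmt.Printf("le=%q: string match with \"1\": %t, float match: %t\n",
			le, le == "1", sameFloat(le, "1"))
	}
}
```

Only the first spelling matches `le="1"` as a string, although all three denote the same number.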

At the dev summit in September 2023 we decided to always normalize the values of `le` and `quantile` labels, presumably in OM style. While that will unify things for the future, we still have the problem at hand that a migration from classic to native histograms cannot be truly smooth because the `le` labels will change once you set the native histogram feature flag.
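
To get a feeling for which `le` values are affected, here is a rough Go sketch. It assumes that OM-style formatting differs from Go's `%g` mainly by always carrying a decimal point or an exponent (so `1` becomes `1.0`). That is an approximation for illustration, not the exact parser code, and the function names are made up.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// formatClassic uses vanilla Go %g formatting.
func formatClassic(f float64) string {
	return fmt.Sprintf("%g", f)
}

// formatOpenMetrics approximates OM style (assumption): like %g, but a
// finite value without '.' or exponent gets ".0" appended.
func formatOpenMetrics(f float64) string {
	s := formatClassic(f)
	if math.IsNaN(f) || math.IsInf(f, 0) || strings.ContainsAny(s, "e.") {
		return s
	}
	return s + ".0"
}

// changedBounds returns the bucket boundaries whose label value differs
// between the two styles.
func changedBounds(bounds []float64) []float64 {
	var changed []float64
	for _, b := range bounds {
		if formatClassic(b) != formatOpenMetrics(b) {
			changed = append(changed, b)
		}
	}
	return changed
}

func main() {
	bounds := []float64{0.005, 0.5, 1, 2.5, 10, math.Inf(1)}
	for _, b := range changedBounds(bounds) {
		fmt.Printf("le=%q becomes le=%q\n", formatClassic(b), formatOpenMetrics(b))
	}
}
```

Under this assumption, only the integer-valued boundaries change, but that is enough to break existing queries and dashboards that match on `le="1"`.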

## Proposal

In short, make the normalization of the `le` and `quantile` label values configurable in the scrape config. 

The setting needs a good name. `normalize_histogram_and_quantile_labels` is probably too long, but you know what I mean…

This new scrape config setting should apply to scrapes in all formats, thereby fulfilling the dev-summit feature demand. However, by keeping things configurable, users can migrate (or not migrate) from one format to another under their own control.

The possible values could be `classic` to normalize in the "classic" Prometheus style, i.e. vanilla Go `%g` formatting, `openmetrics` to normalize in OpenMetrics style, and `none` to take the values as they are from the text format.
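
A minimal sketch of how the three values might behave for a label value coming from the text format. Everything here is hypothetical: the function name `normalizeLabel` and the constants are invented for illustration, and the OM-style branch is the same approximation as before (append `.0` if there is neither a decimal point nor an exponent).

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// Setting values as named in the proposal; the identifiers are made up.
const (
	modeNone        = "none"
	modeClassic     = "classic"
	modeOpenMetrics = "openmetrics"
)

// normalizeLabel rewrites a raw `le` or `quantile` value according to mode.
func normalizeLabel(mode, raw string) (string, error) {
	if mode == modeNone {
		return raw, nil // take the value as exposed by the target
	}
	f, err := strconv.ParseFloat(raw, 64)
	if err != nil {
		return "", fmt.Errorf("label value %q is not a float: %w", raw, err)
	}
	s := strconv.FormatFloat(f, 'g', -1, 64) // same result as %g
	switch mode {
	case modeClassic:
		return s, nil
	case modeOpenMetrics:
		if math.IsNaN(f) || math.IsInf(f, 0) || strings.ContainsAny(s, "e.") {
			return s, nil
		}
		return s + ".0", nil
	default:
		return "", fmt.Errorf("unknown normalization mode %q", mode)
	}
}

func main() {
	for _, mode := range []string{modeNone, modeClassic, modeOpenMetrics} {
		for _, raw := range []string{"1", "1.0", "0.50", "+Inf"} {
			out, err := normalizeLabel(mode, raw)
			if err != nil {
				fmt.Println("error:", err)
				continue
			}
			fmt.Printf("%-11s le=%q -> %q\n", mode, raw, out)
		}
	}
}
```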

The _default_ value of the setting would be `none` for now, i.e. keep everything as is. In Prom 3.x, we could set the default to `openmetrics`.

There is a little wrinkle for protobuf because there is no `none`. Protobuf always normalizes for the reasons explained above. So the default behavior with protobuf format would effectively be `openmetrics` (if we keep things as they are now). However, the setting would still help, because you could set it to `classic` for Go targets to ease the migration. (Since we cannot know what the exposition would have looked like in text format if we scrape with protobuf, there is no easy way to automatically "do the right thing".)
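
The protobuf wrinkle could be expressed as a small resolution step. This is only a sketch with an invented function name (`effectiveMode`), assuming the current OM-style protobuf behavior stays the fallback.

```go
package main

import "fmt"

// effectiveMode resolves the configured setting for a given scrape.
// With protobuf there is no original string to preserve, so "none"
// falls back to "openmetrics" (the current protobuf parser behavior).
func effectiveMode(configured string, protobuf bool) string {
	if protobuf && configured == "none" {
		return "openmetrics"
	}
	return configured
}

func main() {
	for _, protobuf := range []bool{false, true} {
		for _, mode := range []string{"none", "classic", "openmetrics"} {
			fmt.Printf("protobuf=%t configured=%s effective=%s\n",
				protobuf, mode, effectiveMode(mode, protobuf))
		}
	}
}
```

Setting `classic` explicitly for Go targets is the case that eases the migration: it takes effect for both text and protobuf scrapes.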

## Alternatives

We could also provide a relabel action to normalize labels in one way or the other. It could also be applied to other labels that happen to be numbers. While the relabel action could also allow for a smooth migration, it would be very tedious to manually identify all classic histograms and summaries and relabel them. Arguably, a scrape config setting and a relabel action could both exist for different use cases.
