Skip to content

Conversation

ArthurSens
Copy link
Member

Superseeding #16014 by @ywwg

This PR adds a translation option for the OTLP endpoint, where UTF-8 characters aren't translated into underscores, AND unit and type suffixes aren't added.

The wording is very explicit that this mode is EXPERIMENTAL, meaning it's unsafe to enable in a production environment where OTel metrics with the same name but different types/units are present.

The plan for this translation mode is to reword the documentation to state that it should be used together with the feature flag type-and-unit-labels once the proposal prometheus/proposals#39 is implemented in #16228.

The wording "experimental" will only be removed once the feature flag is considered stable.

Copy link
Member

@ywwg ywwg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the default strategy if nothing is set in the config? for now it should be escaping+suffixes, right?

thank you for pulling this together

@ArthurSens
Copy link
Member Author

correct!

bwplotka added a commit to prometheus/docs that referenced this pull request Apr 17, 2025
…context.

### Notable changes

* Cleaned up should/must/may to upper case for clarity (as per RFC style we do in specs).
* Mentioned UTF-8 support.
* Lifted limited character set from MUST to SHOULD with a clear about consequences and downsides.
* Mentioned alternative notations.
* Removed experimental mentions about native histograms, made it part of the main sample definition.
* Added section on WHY Prometheus metric SHOULD contain type and unit suffix, even with our type and unit labels one day.

### Motivation

* [OpenMetrics WG 2.0 Action item](https://docs.google.com/document/d/1FCD-38Xz1-9b3ExgHOeDTQUKUatzgj5KbCND9t-abZY/edit?tab=t.lvx6fags1fga#heading=h.3xjg8jcpctk4
) and blocker to relax MUST to SHOULD type and unit suffix in OpenMetrics 2.0.
* Similarly, to have something clear to reference to highlight huge downsides of skipping suffixes for OTLP (prometheus/prometheus#16441)
* To help with explaining this problem to Prometheus users in general (on recent KubeCons we had to explain this hundrends of times).


Signed-off-by: bwplotka <bwplotka@gmail.com>
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks 💪🏽

Approving as we already had a DevSummit consensus for this around the suffix discussion in OM.

I suggested a different wording and referencing Prometheus docs (plus prometheus/docs#2626), but otherwise LGTM!

@aknuds1
Copy link
Contributor

aknuds1 commented Apr 17, 2025

Taking a look.

@aknuds1 aknuds1 requested a review from Copilot April 17, 2025 15:11
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (2)

storage/remote/write_handler_test.go:859

  • [nitpick] Ensure that labels.Equal correctly handles cases where the order of label declarations might differ. If the order is not guaranteed, consider using or documenting an order-insensitive comparison to prevent potential flaky tests.
if labels.Equal(expected.l, got.l) && expected.t == got.t && expected.v == got.v {

config/config_test.go:1700

  • [nitpick] Consider adding a test case for an unset or default OTLPConfig.TranslationStrategy to verify that the default behavior aligns with expectations. This will help prevent regressions if configuration defaults change in the future.
t.Run("incompatible config - NoUTF8EscapingWithSuffixes", func(t *testing.T) {

Copy link
Contributor

@aknuds1 aknuds1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good, but please see comments.

@ArthurSens ArthurSens requested review from aknuds1, ywwg and bwplotka April 17, 2025 18:57
@ArthurSens ArthurSens force-pushed the otlp-no-translation branch from 36fa33c to 12a03b6 Compare April 17, 2025 19:23
@aknuds1 aknuds1 requested a review from Copilot April 18, 2025 07:20
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces an experimental OTLP translation mode ("NoTranslation") that bypasses character translation and suffix appending, along with necessary test and configuration updates.

  • New test cases covering three translation strategies have been added in write_test.go.
  • The OTLPWriteHandler and configuration validation logic have been updated to account for the new "NoTranslation" mode.
  • Corresponding configuration files and tests have been updated to validate the expected error messages and behaviors.

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
storage/remote/write_test.go Added test cases for NoTranslation and alternative translation strategies, and updated expected metric names.
storage/remote/write_handler.go Updated converter options based on the selected translation strategy.
config/testdata/otlp_no_translation.incompatible.yml Added a sample config that should trigger an error with legacy metric validation.
config/testdata/otlp_no_translation.good.yml Added a sample config that uses the new NoTranslation mode successfully.
config/config_test.go Extended tests for OTLP translation strategy error messages and configuration loading.
config/config.go Modified configuration validation to treat NoTranslation similar to NoUTF8EscapingWithSuffixes under legacy validation.
Comments suppressed due to low confidence (2)

storage/remote/write_handler.go:580

  • Double-check that disabling metric suffixes in 'NoTranslation' mode while enabling UTF8 (via the AllowUTF8 setting in the next line) aligns with the intended behavior; clearer documentation here may help avoid future confusion.
AddMetricSuffixes: otlpCfg.TranslationStrategy != config.NoTranslation,

config/config_test.go:1704

  • [nitpick] Ensure the updated error message format (with quotes around the translation strategy) is consistently documented in related documentation or release notes.
require.ErrorContains(t, err, `OTLP translation strategy "NoUTF8EscapingWithSuffixes" is not allowed when UTF8 is disabled`)

Copy link
Contributor

@aknuds1 aknuds1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a [FEATURE] entry to the changelog? The PR adds a significant feature after all.

Copy link
Contributor

@aknuds1 aknuds1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks. I still wonder whether the PR should include a changelog item, to make the release shepherd's life easier, but no blocker.

Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
@ArthurSens ArthurSens force-pushed the otlp-no-translation branch from 12a03b6 to 08c40f3 Compare April 22, 2025 13:30
@ArthurSens ArthurSens merged commit d9c0ad1 into main Apr 22, 2025
45 checks passed
@ArthurSens ArthurSens deleted the otlp-no-translation branch April 22, 2025 16:13
@jmichalek132
Copy link
Contributor

Finally 🎉

bwplotka added a commit to prometheus/docs that referenced this pull request Apr 24, 2025
… with the latest context (#2626)

* naming: Update metric and label name recommendations with the latest context.

### Notable changes

* Cleaned up should/must/may to upper case for clarity (as per RFC style we do in specs).
* Mentioned UTF-8 support.
* Lifted limited character set from MUST to SHOULD with a clear about consequences and downsides.
* Mentioned alternative notations.
* Removed experimental mentions about native histograms, made it part of the main sample definition.
* Added section on WHY Prometheus metric SHOULD contain type and unit suffix, even with our type and unit labels one day.

### Motivation

* [OpenMetrics WG 2.0 Action item](https://docs.google.com/document/d/1FCD-38Xz1-9b3ExgHOeDTQUKUatzgj5KbCND9t-abZY/edit?tab=t.lvx6fags1fga#heading=h.3xjg8jcpctk4
) and blocker to relax MUST to SHOULD type and unit suffix in OpenMetrics 2.0.
* Similarly, to have something clear to reference to highlight huge downsides of skipping suffixes for OTLP (prometheus/prometheus#16441)
* To help with explaining this problem to Prometheus users in general (on recent KubeCons we had to explain this hundrends of times).


Signed-off-by: bwplotka <bwplotka@gmail.com>

* Apply suggestions from code review

Co-authored-by: Bryan Boreham <bjboreham@gmail.com>
Co-authored-by: Björn Rabenstein <beorn@grafana.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Address comments.

Signed-off-by: bwplotka <bwplotka@gmail.com>

---------

Signed-off-by: bwplotka <bwplotka@gmail.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Bryan Boreham <bjboreham@gmail.com>
Co-authored-by: Björn Rabenstein <beorn@grafana.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants