A target fully supporting OpenMetrics will expose `..._created` lines for each metric child. The way Prometheus currently ingests OpenMetrics, each of those will create an additional time series. The value is (mostly) constant and will therefore compress well. However, the number of time series will (almost) double, which is a significant bump in resource usage. This is in stark contrast to the almost non-existent usefulness of those time series in the current Prometheus context.
Prometheus negotiates OpenMetrics by default (and that behavior cannot be disabled). Thus, with each target starting to support OpenMetrics, the number of time series will increase. This will come as a surprise to the operator of the Prometheus server if the OpenMetrics support is just a side effect of upgrading to a new version of a 3rd-party-supported target. For example, once K8s components expose OpenMetrics including the `..._created` lines, a routine K8s upgrade will suddenly add all of those lines to every scrape.
While a metrics change is technically something operators should take into account when upgrading targets, I expect a lot of confusion, surprise, and even monitoring outages if we don't handle the change in a more robust way. See prometheus/client_python#438 for reactions when the Python client started to support OpenMetrics.
The currently recommended action is probably to add `metric_relabel_configs` to drop all metrics with a name ending in `_created`. This has a number of issues:
- It is opt-in, and it is so in a very non-obvious way. Even if we advertise this practice aggressively, I'd expect many operators to not notice.
- It will drop all metrics with a name ending in `_created`, even those that are not auto-created but regular metrics.
- It has a more or less significant performance impact.
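To illustrate the second issue, here is a minimal Python sketch of what a blanket suffix-based drop does to a set of metric names. It is not Prometheus code: `blanket_drop` and the metric `tickets_created` are hypothetical, and the pattern `.+_created` is only an assumption about how such a drop rule would typically be written. `re.fullmatch` is used because Prometheus relabeling regexes are anchored at both ends.

```python
import re

# Assumed drop pattern; fullmatch mirrors the anchoring of relabeling regexes.
DROP_CREATED = re.compile(r".+_created")


def blanket_drop(metric_names):
    """Return the metric names that survive a suffix-based drop rule."""
    return [name for name in metric_names if not DROP_CREATED.fullmatch(name)]


names = [
    "http_requests_total",
    "http_requests_created",  # auto-generated OpenMetrics creation timestamp
    "tickets_created",        # hypothetical regular metric, not auto-generated
]

# The regular metric "tickets_created" is dropped along with the auto-generated one.
print(blanket_drop(names))
```

The rule cannot tell the two `_created` names apart, because it only looks at the suffix.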
I propose to handle the `..._created` lines in a different way.
The minimal option would be to automatically ignore those lines if all of the following are true:
- It's an OpenMetrics exposition.
- The `..._created` line has a corresponding "proper" metric (e.g. `foobar_created` goes along with `foobar` or `foobar_total` or `foobar_sum`/`foobar_count` etc.).
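The two conditions above could be sketched roughly as follows. This is an illustration in Python, not the Prometheus parser: `drop_auto_created` and `SIBLING_SUFFIXES` are hypothetical names, the `_bucket` suffix is my assumption for what the "etc." covers, and the sketch matches on metric names only, ignoring labels.

```python
CREATED_SUFFIX = "_created"
# Suffixes of "proper" metrics a _created line may accompany.
# "_bucket" is an assumption; the others are the ones named above.
SIBLING_SUFFIXES = ("", "_total", "_sum", "_count", "_bucket")


def drop_auto_created(names, is_openmetrics):
    """Drop *_created names only if the scrape is OpenMetrics and a sibling exists."""
    if not is_openmetrics:
        # Condition 1 fails: leave the exposition untouched.
        return list(names)
    present = set(names)
    kept = []
    for name in names:
        if name.endswith(CREATED_SUFFIX):
            base = name[: -len(CREATED_SUFFIX)]
            # Condition 2: a corresponding "proper" metric is present.
            if base and any(base + suffix in present for suffix in SIBLING_SUFFIXES):
                continue
        kept.append(name)
    return kept


scraped = ["foobar_total", "foobar_created", "tickets_created"]
print(drop_auto_created(scraped, is_openmetrics=True))
```

Here `foobar_created` is ignored because `foobar_total` exists, while the hypothetical regular metric `tickets_created` has no sibling and is kept.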
The above will avoid dropping regular metrics that happen to have a name ending in `_created`.
However, perhaps we can do even better and actually make use of the `..._created` lines: If the creation timestamp passes certain sanity checks (earlier than the scrape time, but not too much), we can artificially insert "zero" samples for counters, histograms, and summaries that spring into existence with a value greater than zero. This would finally provide a solution to the long-standing problem described in #1673.
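As a rough sketch of the sanity checks and the zero insertion, assuming timestamps in seconds: the function name, the one-hour `MAX_CREATED_AGE_SECONDS` threshold, and the choice to place the synthetic sample at the creation timestamp are all assumptions made for illustration, not a settled design.

```python
MAX_CREATED_AGE_SECONDS = 3600.0  # hypothetical bound for "not too much" earlier


def synthetic_zero_sample(created_ts, scrape_ts, first_value,
                          max_age=MAX_CREATED_AGE_SECONDS):
    """Return a (timestamp, 0.0) sample to insert before the first scrape, or None."""
    if first_value <= 0:
        # The series already starts at zero; nothing to insert.
        return None
    if created_ts >= scrape_ts:
        # The creation timestamp must be earlier than the scrape time.
        return None
    if scrape_ts - created_ts > max_age:
        # Created too long ago to be trusted as the start of this series.
        return None
    # Assumption: the zero sample is placed at the creation timestamp.
    return (created_ts, 0.0)


# A counter first seen with value 5, created 10 seconds before the scrape.
print(synthetic_zero_sample(created_ts=990.0, scrape_ts=1000.0, first_value=5.0))
```

With a zero sample in place, a rate calculation over the first scrape interval would see the increase from 0 to the first observed value instead of missing it.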