Inconsistent vector cannot contain metrics with the same labelset errors for functions over range vectors #14695

@charleskorn

What did you do?

Running a query like max_over_time({__name__=~"metric_.*"}[5m]) produces inconsistent results depending on whether it is evaluated one step at a time or as a single range query covering the same steps.

I've summarised the issue with a test case in promqltest syntax:

load 6m
  metric_1{common="label"} 0 1 _ _ 4 5
  metric_2{common="label"} _ _ 2 3 _ 6

# No conflicts, should merge series into one output series.
#
# This succeeds.
eval range from 0 to 24m step 6m ceil({__name__=~"metric_.*"})
  {common="label"} 0 1 2 3 4

# Same as above, but with conflict at T=30m.
#
# This succeeds (ie. it returns the expected error message).
eval_fail range from 0 to 30m step 6m ceil({__name__=~"metric_.*"})
  expected_fail_message vector cannot contain metrics with the same labelset

# Same two cases as above, but with a function that takes a range vector.
#
# All of these single step range queries succeed. Range queries that only select metric_1 or metric_2 (eg. 0 to 6m, or 12m to 18m) also succeed.
eval range from 0 to 0 step 1m max_over_time({__name__=~"metric_.*"}[5m])
  {common="label"} 0

eval range from 6m to 6m step 1m max_over_time({__name__=~"metric_.*"}[5m])
  {common="label"} 1

eval range from 12m to 12m step 1m max_over_time({__name__=~"metric_.*"}[5m])
  {common="label"} 2

eval range from 18m to 18m step 1m max_over_time({__name__=~"metric_.*"}[5m])
  {common="label"} 3

eval range from 24m to 24m step 1m max_over_time({__name__=~"metric_.*"}[5m])
  {common="label"} 4

# This is the problematic case:
#
# This range query covers all of the above steps, but fails with "vector cannot contain metrics with the same labelset".
eval range from 0 to 24m step 6m max_over_time({__name__=~"metric_.*"}[5m])
  {common="label"} 0 1 2 3 4

# This succeeds (ie. it returns the expected error message).
eval_fail range from 0 to 30m step 6m max_over_time({__name__=~"metric_.*"}[5m])
  expected_fail_message vector cannot contain metrics with the same labelset

(I've used eval range throughout because eval instant also runs a range query equivalent of the expression, which runs into a legitimate instance of vector cannot contain metrics with the same labelset.)

What did you expect to see?

All test cases behave as expected, ie. are consistent regardless of the time range queried.
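
To make the expected behaviour concrete, here is a small Python model of the test data. It is only an illustrative sketch, not Prometheus code: the helper names (max_over_time_at, eval_step) are made up for this example, and it assumes the [5m] window at step t covers (t - 5m, t]. With samples 6m apart, each window holds at most one sample per series, so a conflict only arises where both series have a sample at the same step.

```python
# Samples from the `load 6m` block above; None marks a gap (`_`).
INTERVAL_MIN = 6  # spacing between loaded samples, in minutes
WINDOW_MIN = 5    # the [5m] range selector
SERIES = {
    "metric_1": [0, 1, None, None, 4, 5],
    "metric_2": [None, None, 2, 3, None, 6],
}


def max_over_time_at(samples, t_min):
    """Max of the samples in the window (t - 5m, t], or None if it is empty."""
    in_window = [
        v
        for i, v in enumerate(samples)
        if v is not None and t_min - WINDOW_MIN < i * INTERVAL_MIN <= t_min
    ]
    return max(in_window) if in_window else None


def eval_step(t_min):
    """Evaluate one step; fail only if both inputs produce a value at this step.

    Once the metric name is dropped, both inputs map to {common="label"}.
    """
    values = [
        v
        for v in (max_over_time_at(s, t_min) for s in SERIES.values())
        if v is not None
    ]
    if len(values) > 1:
        raise ValueError("vector cannot contain metrics with the same labelset")
    return values[0] if values else None


# Steps 0, 6m, ..., 24m: exactly one series contributes at each step.
print([eval_step(t) for t in range(0, 25, 6)])
```

Under this model the steps from 0 to 24m merge cleanly into one output series, and only the step at 30m raises the error, which is what the test cases above expect.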

What did you see instead? Under which circumstances?

The eval range from 0 to 24m step 6m max_over_time({__name__=~"metric_.*"}[5m]) scenario fails with vector cannot contain metrics with the same labelset.
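
The outcomes above would be consistent with the duplicate-labelset check being applied to whole output series rather than to each step. That is an assumption on my part and not something I have confirmed in the engine. The Python sketch below models that hypothetical behaviour with made-up helper names, using the same test data and a (t - 5m, t] window.

```python
ERR = "vector cannot contain metrics with the same labelset"
INTERVAL_MIN = 6  # spacing between loaded samples, in minutes
WINDOW_MIN = 5    # the [5m] range selector
SERIES = {
    "metric_1": [0, 1, None, None, 4, 5],
    "metric_2": [None, None, 2, 3, None, 6],
}


def max_over_time_at(samples, t_min):
    """Max of the samples in the window (t - 5m, t], or None if it is empty."""
    in_window = [
        v
        for i, v in enumerate(samples)
        if v is not None and t_min - WINDOW_MIN < i * INTERVAL_MIN <= t_min
    ]
    return max(in_window) if in_window else None


def eval_range_per_series(start_min, end_min, step_min):
    """Hypothetical whole-range evaluation: one output series per input series."""
    steps = range(start_min, end_min + 1, step_min)
    outputs = []
    for samples in SERIES.values():
        points = [max_over_time_at(samples, t) for t in steps]
        if any(p is not None for p in points):
            outputs.append(points)
    # Every output carries {common="label"} once the name is dropped, so two
    # non-empty outputs count as duplicates even if they never share a step.
    if len(outputs) > 1:
        raise ValueError(ERR)
    return outputs[0] if outputs else []


for start, end, step in [(0, 0, 1), (12, 18, 6), (0, 24, 6)]:
    try:
        print(start, end, eval_range_per_series(start, end, step))
    except ValueError as exc:
        print(start, end, "error:", exc)
```

In this model, ranges that only touch metric_1 or metric_2 succeed, while the range from 0 to 24m fails even though the two series never have a value at the same step, matching what I observed.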

System information

No response

Prometheus version

No response

Prometheus configuration file

No response

Alertmanager version

No response

Alertmanager configuration file

No response

Logs

No response
