MQE: add support for `topk` and `bottomk` #10614

charleskorn · 2025-02-10T04:31:09Z

What this PR does

This PR adds support for topk and bottomk to MQE.

Note that I have not added topk and bottomk to the mixed metrics gauntlet as there are many false-positive failures: topk and bottomk do not guarantee which series will be returned when multiple series in a group have the same value, and due to differences in Prometheus' engine and MQE, they each select different series in these cases.

Compared to Prometheus' engine, MQE is up to 30% faster and uses up to 70% less memory at peak for non-trivial benchmarks:

                                                               │  Prometheus  │               Mimir                │
                                                               │    sec/op    │   sec/op     vs base               │
Query/topk(1,_a_1),_instant_query-10                              147.3µ ± 5%   144.6µ ± 1%   -1.83% (p=0.026 n=6)
Query/topk(1,_a_1),_range_query_with_100_steps-10                 175.5µ ± 1%   157.6µ ± 2%  -10.20% (p=0.002 n=6)
Query/topk(1,_a_1),_range_query_with_1000_steps-10                462.8µ ± 2%   280.9µ ± 0%  -39.31% (p=0.002 n=6)
Query/topk(1,_a_100),_instant_query-10                            807.0µ ± 2%   776.8µ ± 1%   -3.75% (p=0.002 n=6)
Query/topk(1,_a_100),_range_query_with_100_steps-10               1.388m ± 0%   1.378m ± 1%   -0.70% (p=0.015 n=6)
Query/topk(1,_a_100),_range_query_with_1000_steps-10              6.246m ± 1%   6.294m ± 0%        ~ (p=0.065 n=6)
Query/topk(1,_a_2000),_instant_query-10                           10.89m ± 2%   10.40m ± 1%   -4.53% (p=0.002 n=6)
Query/topk(1,_a_2000),_range_query_with_100_steps-10              21.03m ± 1%   20.83m ± 1%        ~ (p=0.065 n=6)
Query/topk(1,_a_2000),_range_query_with_1000_steps-10             134.6m ± 1%   110.3m ± 0%  -18.05% (p=0.002 n=6)
Query/topk(5,_a_1),_instant_query-10                              146.2µ ± 1%   140.8µ ± 2%   -3.69% (p=0.002 n=6)
Query/topk(5,_a_1),_range_query_with_100_steps-10                 175.7µ ± 1%   157.4µ ± 1%  -10.44% (p=0.002 n=6)
Query/topk(5,_a_1),_range_query_with_1000_steps-10                462.4µ ± 3%   279.4µ ± 1%  -39.57% (p=0.002 n=6)
Query/topk(5,_a_100),_instant_query-10                            811.7µ ± 1%   783.2µ ± 1%   -3.51% (p=0.002 n=6)
Query/topk(5,_a_100),_range_query_with_100_steps-10               1.626m ± 3%   1.576m ± 1%   -3.03% (p=0.002 n=6)
Query/topk(5,_a_100),_range_query_with_1000_steps-10              8.576m ± 1%   8.296m ± 6%        ~ (p=0.065 n=6)
Query/topk(5,_a_2000),_instant_query-10                           10.89m ± 1%   10.47m ± 1%   -3.85% (p=0.002 n=6)
Query/topk(5,_a_2000),_range_query_with_100_steps-10              23.21m ± 1%   23.14m ± 2%        ~ (p=0.240 n=6)
Query/topk(5,_a_2000),_range_query_with_1000_steps-10             179.0m ± 2%   132.7m ± 0%  -25.87% (p=0.002 n=6)
Query/topk_by_(le)_(5,_h_1),_instant_query-10                     193.4µ ± 2%   184.8µ ± 2%   -4.41% (p=0.002 n=6)
Query/topk_by_(le)_(5,_h_1),_range_query_with_100_steps-10        396.9µ ± 3%   261.8µ ± 1%  -34.04% (p=0.002 n=6)
Query/topk_by_(le)_(5,_h_1),_range_query_with_1000_steps-10      2016.5µ ± 3%   898.6µ ± 1%  -55.44% (p=0.002 n=6)
Query/topk_by_(le)_(5,_h_100),_instant_query-10                   3.844m ± 2%   3.676m ± 1%   -4.37% (p=0.002 n=6)
Query/topk_by_(le)_(5,_h_100),_range_query_with_100_steps-10      8.550m ± 1%   8.336m ± 2%   -2.50% (p=0.015 n=6)
Query/topk_by_(le)_(5,_h_100),_range_query_with_1000_steps-10     49.95m ± 2%   47.92m ± 0%   -4.08% (p=0.002 n=6)
Query/topk_by_(le)_(5,_h_2000),_instant_query-10                  68.97m ± 2%   66.73m ± 2%   -3.26% (p=0.004 n=6)
Query/topk_by_(le)_(5,_h_2000),_range_query_with_100_steps-10     157.3m ± 1%   145.7m ± 1%   -7.38% (p=0.002 n=6)
Query/topk_by_(le)_(5,_h_2000),_range_query_with_1000_steps-10   1169.4m ± 0%   795.4m ± 0%  -31.98% (p=0.002 n=6)
geomean                                                           4.156m        3.599m       -13.41%


                                                               │  Prometheus   │                Mimir                │
                                                               │       B       │      B        vs base               │
Query/topk(1,_a_1),_instant_query-10                              66.27Mi ± 1%   65.15Mi ± 1%   -1.70% (p=0.009 n=6)
Query/topk(1,_a_1),_range_query_with_100_steps-10                 64.23Mi ± 1%   64.72Mi ± 2%   +0.77% (p=0.041 n=6)
Query/topk(1,_a_1),_range_query_with_1000_steps-10                63.00Mi ± 1%   64.52Mi ± 1%   +2.42% (p=0.002 n=6)
Query/topk(1,_a_100),_instant_query-10                            61.68Mi ± 1%   60.53Mi ± 2%   -1.86% (p=0.009 n=6)
Query/topk(1,_a_100),_range_query_with_100_steps-10               62.03Mi ± 1%   60.78Mi ± 1%   -2.02% (p=0.004 n=6)
Query/topk(1,_a_100),_range_query_with_1000_steps-10              64.65Mi ± 1%   61.77Mi ± 1%   -4.46% (p=0.002 n=6)
Query/topk(1,_a_2000),_instant_query-10                           63.16Mi ± 2%   62.32Mi ± 1%   -1.34% (p=0.017 n=6)
Query/topk(1,_a_2000),_range_query_with_100_steps-10              70.54Mi ± 1%   63.90Mi ± 1%   -9.41% (p=0.002 n=6)
Query/topk(1,_a_2000),_range_query_with_1000_steps-10            128.05Mi ± 2%   68.60Mi ± 1%  -46.43% (p=0.002 n=6)
Query/topk(5,_a_1),_instant_query-10                              66.30Mi ± 1%   65.70Mi ± 1%        ~ (p=0.195 n=6)
Query/topk(5,_a_1),_range_query_with_100_steps-10                 64.67Mi ± 1%   65.26Mi ± 1%   +0.91% (p=0.026 n=6)
Query/topk(5,_a_1),_range_query_with_1000_steps-10                62.91Mi ± 1%   64.60Mi ± 1%   +2.69% (p=0.002 n=6)
Query/topk(5,_a_100),_instant_query-10                            61.24Mi ± 1%   61.05Mi ± 1%   -0.31% (p=0.006 n=6)
Query/topk(5,_a_100),_range_query_with_100_steps-10               62.09Mi ± 0%   60.99Mi ± 2%   -1.76% (p=0.002 n=6)
Query/topk(5,_a_100),_range_query_with_1000_steps-10              64.75Mi ± 1%   62.00Mi ± 1%   -4.25% (p=0.002 n=6)
Query/topk(5,_a_2000),_instant_query-10                           63.61Mi ± 1%   62.17Mi ± 1%   -2.26% (p=0.009 n=6)
Query/topk(5,_a_2000),_range_query_with_100_steps-10              71.10Mi ± 2%   63.48Mi ± 1%  -10.71% (p=0.002 n=6)
Query/topk(5,_a_2000),_range_query_with_1000_steps-10            126.81Mi ± 1%   68.02Mi ± 2%  -46.37% (p=0.002 n=6)
Query/topk_by_(le)_(5,_h_1),_instant_query-10                     64.33Mi ± 1%   63.59Mi ± 1%   -1.15% (p=0.045 n=6)
Query/topk_by_(le)_(5,_h_1),_range_query_with_100_steps-10        62.95Mi ± 1%   63.40Mi ± 2%        ~ (p=0.240 n=6)
Query/topk_by_(le)_(5,_h_1),_range_query_with_1000_steps-10       64.27Mi ± 1%   62.56Mi ± 2%   -2.66% (p=0.004 n=6)
Query/topk_by_(le)_(5,_h_100),_instant_query-10                   61.88Mi ± 1%   61.54Mi ± 1%        ~ (p=0.180 n=6)
Query/topk_by_(le)_(5,_h_100),_range_query_with_100_steps-10      65.20Mi ± 1%   63.95Mi ± 1%   -1.92% (p=0.002 n=6)
Query/topk_by_(le)_(5,_h_100),_range_query_with_1000_steps-10     83.88Mi ± 1%   67.52Mi ± 1%  -19.50% (p=0.002 n=6)
Query/topk_by_(le)_(5,_h_2000),_instant_query-10                  70.71Mi ± 1%   66.52Mi ± 1%   -5.93% (p=0.002 n=6)
Query/topk_by_(le)_(5,_h_2000),_range_query_with_100_steps-10    112.00Mi ± 1%   71.28Mi ± 1%  -36.36% (p=0.002 n=6)
Query/topk_by_(le)_(5,_h_2000),_range_query_with_1000_steps-10   265.25Mi ± 1%   75.28Mi ± 3%  -71.62% (p=0.002 n=6)
geomean                                                           73.73Mi        64.41Mi       -12.64%

Which issue(s) this PR fixes or relates to

#10067

Checklist

Tests updated.
[n/a] Documentation added.
[covered by Mimir Query Engine #10067] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
[n/a] about-versioning.md updated with experimental features.

…rting that they are unsupported

jhesketh

We should put this through the gauntlet, at least for instant queries to confirm the order.

We should also have a feature flag for this operator since it doesn't reuse most of the aggregation operator.

(I need to review range_query a bit closer still, but at first glance looks good)

pkg/streamingpromql/operators/aggregations/topkbottomk/topk_bottomk.go

pkg/streamingpromql/testdata/ours/aggregators.test

jhesketh · 2025-02-12T04:07:00Z

pkg/streamingpromql/operators/aggregations/aggregation.go

-// whereas this method does.
-// This saves us allocating a new buffer and string for every single input series, which has a noticeable performance impact.
-type seriesToGroupLabelsBytesFunc func(labels.Labels) []byte
+func (a *Aggregation) groupLabelsBytesFunc() SeriesToGroupLabelsBytesFunc {


I don't think this needs to be a method any more. We should be able to just use GroupLabelsBytesFunc on line 138 (same with the new operators too)

I've changed this in the new operators, but TestAggregation_GroupLabelling uses this method, so I'd prefer to keep it as-is in Aggregation.

(not blocking) Why not update those tests?

To me, it's more valuable to test the Aggregation operator itself, as it does some work to make these tests pass (eg. adding the __name__ label to the list of labels when without is used) - the fact it uses a shared function is more of an implementation detail.

pkg/streamingpromql/operators/aggregations/aggregation.go

pkg/streamingpromql/operators/aggregations/common.go

pkg/streamingpromql/operators/aggregations/topkbottomk/range_query.go

pkg/streamingpromql/operators/aggregations/topkbottomk/instant_query.go

pkg/streamingpromql/operators/aggregations/topkbottomk/range_query.go

pkg/streamingpromql/operators/aggregations/topkbottomk/range_query_test.go

charleskorn · 2025-02-13T01:47:20Z

We should put this through the gauntlet, at least for instant queries to confirm the order.

I did try this, but it would not be straightforward:

Note that I have not added topk and bottomk to the mixed metrics gauntlet as there are many false-positive failures: topk and bottomk do not guarantee which series will be returned when multiple series in a group have the same value, and due to differences in Prometheus' engine and MQE, they each select different series in these cases.

charleskorn · 2025-02-13T01:48:45Z

We should also have a feature flag for this operator since it doesn't reuse most of the aggregation operator.

Isn't this covered by -querier.mimir-query-engine.disabled-aggregations here?

…th `without`

jhesketh · 2025-02-14T04:11:07Z

We should put this through the gauntlet, at least for instant queries to confirm the order.

I did try this, but it would not be straightforward:

Note that I have not added topk and bottomk to the mixed metrics gauntlet as there are many false-positive failures: topk and bottomk do not guarantee which series will be returned when multiple series in a group have the same value, and due to differences in Prometheus' engine and MQE, they each select different series in these cases.

Sorry I missed that.

I wonder if we can create a similar gauntlet for these aggregations that only has deterministic combinations? It'd still be good to test more combinations such as stale markers and infinities etc.

jhesketh · 2025-02-14T04:11:50Z

We should also have a feature flag for this operator since it doesn't reuse most of the aggregation operator.

Isn't this covered by -querier.mimir-query-engine.disabled-aggregations here?

Heh, yep, that works.

pkg/streamingpromql/operators/aggregations/topkbottomk/instant_query.go

charleskorn · 2025-02-14T04:48:39Z

I wonder if we can create a similar gauntlet for these aggregations that only has deterministic combinations? It'd still be good to test more combinations such as stale markers and infinities etc.

topk and bottomk should never see stale markers, as they're filtered out by instant vector selectors. They also ignore histograms.

We already have some tests that cover the behaviour of topk and bottomk with infinity and NaN. I don't think the value of adding topk and bottomk to the existing gauntlet (or creating another similar test) outweighs the cost of building, maintaining and running them - the gauntlet tests are becoming slower and slower, and that has an impact on engineer productivity and CI run times.

jhesketh

Looking good, thanks! Just a last few things

pkg/streamingpromql/operators/aggregations/topkbottomk/range_query.go

jhesketh · 2025-02-14T04:47:51Z

pkg/streamingpromql/operators/aggregations/topkbottomk/range_query.go

+			return nil
+		}
+
+		if len(g.seriesForTimestamps[timestampIndex]) == int(limit) {


(nit) point selection/interacting with the heap could be split out into own method

I've moved the other methods out in 54bba6e, which should help make this a little shorter - how's this now?

I still feel this if/else block could be extracted so we have accumulateIntoGroup dealing with getting the series for the group, and a new method such as updateHeapForGroup or similar for dealing with the heap selection etc. (again, jsut a nit)

Done in #10679

jhesketh

lgtm, thanks for dealing with my feedback :-)

charleskorn · 2025-02-14T06:02:02Z

I'm going to merge this as-is so we can start testing it next week, but I will come back and address some of the refactoring as a follow-up PR.

* Enable upstream test cases * Enable benchmark * Add our own test cases * Initial implementation * Fix issue where queries that use `count_values` fail rather than reporting that they are unsupported * Add benchmark with grouping * Add tests for grouping and sorting * Reuse same heap instance to avoid allocating on every input sample. * Add shortcut for case where k=1 * Pool slices of `topKBottomKSeries` * Pool `[][]int` slices * Move to separate package * Initial implementation of sorting for instant queries * Fix sorting of NaN values for instant queries * Expand test cases to cover instant queries as well * Add license headers * Address PR feedback: don't bother allocating a new slice when used with `without` * Address PR feedback related to `GroupLabelsBytesFunc` * Address PR feedback: refactor `GroupLabelsBytesFunc` * Address PR feedback: add and clarify comments * Address PR feedback: rename `limit` to `k` * Rename `topKBottomKSeries` to `rangeQuerySeries` * Add pooling for `[]instantQuerySeries` * Address PR feedback: add further test cases with NaN and infinite values * Add test for instant query sorting and grouping * Address PR feedback: clarify comments * Address PR feedback: clarify error message * Address PR feedback: extract functions

charleskorn added 5 commits February 5, 2025 14:17

Enable upstream test cases

261d451

Enable benchmark

0a75e52

Add our own test cases

715537d

Initial implementation

8c55714

Fix issue where queries that use count_values fail rather than repo…

dd619c1

…rting that they are unsupported

charleskorn mentioned this pull request Feb 10, 2025

Mimir Query Engine #10067

Open

charleskorn added 2 commits February 10, 2025 15:44

Add benchmark with grouping

7489a02

Add tests for grouping and sorting

a6aa656

charleskorn force-pushed the charleskorn/mqe-topk-bottomk branch from 858b84a to a6aa656 Compare February 10, 2025 05:43

charleskorn added 9 commits February 10, 2025 16:47

Reuse same heap instance to avoid allocating on every input sample.

69f7ed5

Add shortcut for case where k=1

df2c536

Pool slices of topKBottomKSeries

e5e74b9

Pool [][]int slices

7e75db7

Move to separate package

5040911

Initial implementation of sorting for instant queries

063da14

Fix sorting of NaN values for instant queries

82753b4

Expand test cases to cover instant queries as well

caf9e2a

Add license headers

5b54c0e

charleskorn marked this pull request as ready for review February 12, 2025 03:14

charleskorn requested a review from a team as a code owner February 12, 2025 03:14

jhesketh reviewed Feb 12, 2025

View reviewed changes

charleskorn added 7 commits February 13, 2025 12:51

Address PR feedback: don't bother allocating a new slice when used wi…

d37816a

…th `without`

Address PR feedback related to GroupLabelsBytesFunc

1cff08f

Address PR feedback: refactor GroupLabelsBytesFunc

eb3e3eb

Address PR feedback: add and clarify comments

fdeb024

Address PR feedback: rename limit to k

d319711

Rename topKBottomKSeries to rangeQuerySeries

2ccd084

Add pooling for []instantQuerySeries

f49c2b8

charleskorn force-pushed the charleskorn/mqe-topk-bottomk branch from 62f545c to f49c2b8 Compare February 13, 2025 02:47

charleskorn added 2 commits February 13, 2025 13:57

Address PR feedback: add further test cases with NaN and infinite values

e05968a

Add test for instant query sorting and grouping

c2ba6e6

charleskorn requested a review from jhesketh February 13, 2025 03:42

jhesketh reviewed Feb 14, 2025

View reviewed changes

pkg/streamingpromql/operators/aggregations/topkbottomk/instant_query.go Outdated Show resolved Hide resolved

charleskorn added 2 commits February 14, 2025 15:33

Address PR feedback: clarify comments

056f1d9

Address PR feedback: clarify error message

1c856c4

jhesketh reviewed Feb 14, 2025

View reviewed changes

Address PR feedback: extract functions

54bba6e

jhesketh approved these changes Feb 14, 2025

View reviewed changes

charleskorn merged commit 33b6dc5 into main Feb 14, 2025
28 checks passed

charleskorn deleted the charleskorn/mqe-topk-bottomk branch February 14, 2025 06:02

charleskorn mentioned this pull request Feb 18, 2025

MQE: address PR feedback on topk / bottomk #10679

Merged

MQE: add support for topk and bottomk #10614

MQE: add support for topk and bottomk #10614

Uh oh!

Conversation

charleskorn commented Feb 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What this PR does

Which issue(s) this PR fixes or relates to

Checklist

Uh oh!

jhesketh left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

jhesketh Feb 12, 2025

Choose a reason for hiding this comment

Uh oh!

charleskorn Feb 13, 2025

Choose a reason for hiding this comment

Uh oh!

jhesketh Feb 14, 2025

Choose a reason for hiding this comment

Uh oh!

charleskorn Feb 14, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

charleskorn commented Feb 13, 2025

Uh oh!

charleskorn commented Feb 13, 2025

Uh oh!

jhesketh commented Feb 14, 2025

Uh oh!

jhesketh commented Feb 14, 2025

Uh oh!

Uh oh!

charleskorn commented Feb 14, 2025

Uh oh!

jhesketh left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jhesketh Feb 14, 2025

Choose a reason for hiding this comment

Uh oh!

charleskorn Feb 14, 2025

Choose a reason for hiding this comment

Uh oh!

jhesketh Feb 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

charleskorn Feb 18, 2025

Choose a reason for hiding this comment

Uh oh!

jhesketh left a comment

Choose a reason for hiding this comment

Uh oh!

charleskorn commented Feb 14, 2025

Uh oh!

Uh oh!

Uh oh!

MQE: add support for `topk` and `bottomk` #10614

MQE: add support for `topk` and `bottomk` #10614

charleskorn commented Feb 10, 2025 •

edited

Loading

jhesketh Feb 14, 2025 •

edited

Loading