Skip to content

Conversation

charleskorn
Copy link
Contributor

@charleskorn charleskorn commented Jun 11, 2025

What this PR does

This PR fixes an issue where queriers could panic with estimated memory consumption of this query is negative while evaluating a query containing a one-to-many or many-to-one binary operation (aka group_left / group_right).

Specifically, the issue is triggered if all of the series with the same group matching labels on the "one" side are read and then the operator is closed before all of the corresponding output series have been emitted. (This can happen if, for example, the query contains further binary operations that only need a subset of the output series.)

In this case, the matchGroup's presence slice would be returned to the pool multiple times: once in updateOneSidePresence, and again when oneSide.Close was called by the first remaining output series in that group.

Which issue(s) this PR fixes or relates to

(none)

Checklist

  • Tests updated.
  • [n/a] Documentation added.
  • [covered by Mimir Query Engine #10067] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • [n/a] about-versioning.md updated with experimental features.

@charleskorn charleskorn marked this pull request as ready for review June 11, 2025 01:09
@charleskorn charleskorn requested a review from a team as a code owner June 11, 2025 01:09
Copy link
Contributor

@lamida lamida left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks reasonable and tests seem covering all the edge cases now.

Follow up comment that can be done on the separate PR.

All instances of types.SomeSlicePool.Put(slice, ..) is always followed by setting slice = nil. Should we nil the slice inside the Put instead to avoid missed setting that up like this?

@charleskorn charleskorn merged commit db7d12b into main Jun 11, 2025
30 checks passed
@charleskorn charleskorn deleted the charleskorn/mqe-negative-memory-consumption branch June 11, 2025 04:27
mimir-github-bot bot pushed a commit that referenced this pull request Jun 11, 2025
…ft` or `group_right` (#11690)

* Fix test names

* Test case where operator is closed before any series are read

* Fix issue where a query containing `group_left` or `group_right` could panic if the binary operator is closed early

(cherry picked from commit db7d12b)
mimir-github-bot bot pushed a commit that referenced this pull request Jun 11, 2025
…ft` or `group_right` (#11690)

* Fix test names

* Test case where operator is closed before any series are read

* Fix issue where a query containing `group_left` or `group_right` could panic if the binary operator is closed early

(cherry picked from commit db7d12b)
mimir-github-bot bot pushed a commit that referenced this pull request Jun 11, 2025
…ft` or `group_right` (#11690)

* Fix test names

* Test case where operator is closed before any series are read

* Fix issue where a query containing `group_left` or `group_right` could panic if the binary operator is closed early

(cherry picked from commit db7d12b)
mimir-github-bot bot pushed a commit that referenced this pull request Jun 11, 2025
…ft` or `group_right` (#11690)

* Fix test names

* Test case where operator is closed before any series are read

* Fix issue where a query containing `group_left` or `group_right` could panic if the binary operator is closed early

(cherry picked from commit db7d12b)
mimir-github-bot bot pushed a commit that referenced this pull request Jun 11, 2025
…ft` or `group_right` (#11690)

* Fix test names

* Test case where operator is closed before any series are read

* Fix issue where a query containing `group_left` or `group_right` could panic if the binary operator is closed early

(cherry picked from commit db7d12b)
@charleskorn
Copy link
Contributor Author

All instances of types.SomeSlicePool.Put(slice, ..) is always followed by setting slice = nil. Should we nil the slice inside the Put instead to avoid missed setting that up like this?

We can't nil the slice inside the Put, but we might be able to do something if Put returns nil - if there's a linting rule to ensure that return values aren't ignored, then this would help catch issues like these. However, last time I looked into this, such a linter didn't exist.

charleskorn added a commit that referenced this pull request Jun 11, 2025
…ft` or `group_right` (#11690) (#11692)

* Fix test names

* Test case where operator is closed before any series are read

* Fix issue where a query containing `group_left` or `group_right` could panic if the binary operator is closed early

(cherry picked from commit db7d12b)

Co-authored-by: Charles Korn <charleskorn@users.noreply.github.com>
charleskorn added a commit that referenced this pull request Jun 11, 2025
…ft` or `group_right` (#11690) (#11695)

* Fix test names

* Test case where operator is closed before any series are read

* Fix issue where a query containing `group_left` or `group_right` could panic if the binary operator is closed early

(cherry picked from commit db7d12b)

Co-authored-by: Charles Korn <charleskorn@users.noreply.github.com>
charleskorn added a commit that referenced this pull request Jun 11, 2025
…ft` or `group_right` (#11690) (#11693)

* Fix test names

* Test case where operator is closed before any series are read

* Fix issue where a query containing `group_left` or `group_right` could panic if the binary operator is closed early

(cherry picked from commit db7d12b)

Co-authored-by: Charles Korn <charleskorn@users.noreply.github.com>
charleskorn added a commit that referenced this pull request Jun 11, 2025
…ft` or `group_right` (#11690) (#11696)

* Fix test names

* Test case where operator is closed before any series are read

* Fix issue where a query containing `group_left` or `group_right` could panic if the binary operator is closed early

(cherry picked from commit db7d12b)

Co-authored-by: Charles Korn <charleskorn@users.noreply.github.com>
charleskorn added a commit that referenced this pull request Jun 11, 2025
…ft` or `group_right` (#11690) (#11694)

* Fix test names

* Test case where operator is closed before any series are read

* Fix issue where a query containing `group_left` or `group_right` could panic if the binary operator is closed early

(cherry picked from commit db7d12b)

Co-authored-by: Charles Korn <charleskorn@users.noreply.github.com>
@bboreham
Copy link
Contributor

Another way to go would be to have Put take a pointer to the slice, then you could overwrite it.

This would be a bit more ugly in the code, but perhaps easier to use?

charleskorn added a commit that referenced this pull request Jun 29, 2025
#### What this PR does

This PR is an alternative to #11741, which implements an idea from
@bboreham
[here](#11690 (comment)).

Compared to #11741, this avoids a need for a linter, at the cost of a
slightly less obvious behaviour and slightly uglier code.

#### Which issue(s) this PR fixes or relates to

#11690

#### Checklist

- [x] Tests updated.
- [n/a] Documentation added.
- [n/a] `CHANGELOG.md` updated - the order of entries should be
`[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`. If changelog entry
is not needed, please add the `changelog-not-needed` label to the PR.
- [n/a]
[`about-versioning.md`](https://github.com/grafana/mimir/blob/main/docs/sources/mimir/configure/about-versioning.md)
updated with experimental features.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants