Skip to content

Conversation

krajorama
Copy link
Member

tsdb: fix inconsistent histogram chunk iterators

The PromQL engine has an implicit requirement that chunk iterators return copies of native histograms. For example here.

Some iterators do not follow this, so this PR updates the interface definition to be more explicit and fixes some iterators.

Caught when go test -race failed on TestInstantQuerySplittingCorrectness in Mimir. Here we are seeding multiple engines with the same series, so reading through storageSeriesIterator put the histogram pointers into two separate pools for reuse here.

The interface definition wasn't strict enough of what we expect of the
iterators.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
In particular storageSeriesIterator was causing an issue when the engine
would read without copy from it here:
https://github.com/prometheus/prometheus/blob/4408f898bc592b570aa0a173c228e4fe2f284474/promql/engine.go#L2203

And then put that pointer into a pool and reuse.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
@krajorama krajorama marked this pull request as draft February 28, 2024 09:24
Only makes sense when the iterator works on data that stores pointers
directly.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
@krajorama krajorama marked this pull request as ready for review February 28, 2024 10:06
@krajorama krajorama requested a review from beorn7 February 28, 2024 10:07
Copy link
Contributor

@duricanikolic duricanikolic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you ensure that all implementations of the Iterator interface have been updated?

// histogram with integer counts. Before the iterator has advanced, the behaviour
// is unspecified.
// The method accepts an optional Histogram object which will be
// The method accepts an optional Histogram object which may be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn’t your PR enforce that the optional Histogram object will be reused?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not checking if the value is reused and also I see some examples of especially mock iterators that always return new objects.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation is then ambiguous. As a caller of it.AtHistogram(h), I want to know whether the object corresponding to the pointer I passed has actually been reused or not, without having to go through all the implementations of the interface.
I think that we should use will as before, even because this way we kind of force new implementations to respect the contract: if they don't reuse the given object, and if something bad happens, then it is their fault.
Related to the mocked iterators, I don't know if they are so relevant.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable. I can also just update the mocks as well to be sure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although there's one non mock that needs rework then: func (c *concreteSeriesIterator) AtFloatHistogram(*histogram.FloatHistogram) (int64, *histogram.FloatHistogram) {

// value is a histogram with floating-point counts. It also works if the
// value is a histogram with integer counts, in which case a
// FloatHistogram copy of the histogram is returned. Before the iterator
// has advanced, the behaviour is unspecified.
// The method accepts an optional FloatHistogram object which will be
// The method accepts an optional FloatHistogram object which may be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn’t your PR enforce that the optional FloatHistogram object will be reused?

@krajorama
Copy link
Member Author

Did you ensure that all implementations of the Iterator interface have been updated?

Yes, I've reviewed all, but could use a second pair of eyes for sure.

Otherwise there's a cyclic dependency via tsdbutil

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
From comment by @zenador

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
@duricanikolic
Copy link
Contributor

Did you ensure that all implementations of the Iterator interface have been updated?

Yes, I've reviewed all, but could use a second pair of eyes for sure.

I have double-checked it, and I confirm that all implementations of AtHistogram() and AtFloatHistogram() have been considered.

@krajorama
Copy link
Member Author

krajorama commented Feb 28, 2024

Note : does not affect web api performance because there the histograms are directly accessed

jsonutil.MarshalHistogram(p.H, stream)

Copy link
Contributor

@duricanikolic duricanikolic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
Thank you

@krajorama krajorama marked this pull request as draft February 28, 2024 14:51
@krajorama
Copy link
Member Author

We've discussed in IRL and this goes deep into the rabbit hole of iterator optimizations, we'll do a different fix for the problem that is explained in the description.

@krajorama
Copy link
Member Author

Closing until we figure out which direction to go in.

@krajorama krajorama closed this Feb 29, 2024
@krajorama krajorama deleted the krajo/fix-inconsistent-chunk-iterators branch March 1, 2024 12:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants