Conversation
@agourlay agourlay commented Jun 26, 2025

The initial goal of this PR was to fix an over-measurement of payload_index_io_read for the Text index.

In practice, the final fix is a performance optimization that decreases the hardware counter as a side effect.

In my tests, the payload_index_io_read value for Text matching is almost 3x higher than what is reported by the read_bytes of the process.

In the case of high-cardinality Text matching, the search is not driven directly by the payload index,
meaning a potentially large number of points needs to be tested against the payload index to check whether the condition matches.

The current approach in dev lazily recomputes, for each tested point, the set of points which match the query tokens.
This set is constant across all tested points.

This artificially inflates the payload_index_io_read counter, as the mmap slice is cached and the repeated reads never reach disk.

The proposed fix is to pre-calculate the intersection of points for the query tokens and reuse it for each point test.

I was able to show locally that this change:

  • fixes the payload_index_io_read to be much closer to read_bytes
  • makes the query slightly faster

The trade-off is to use more memory once to avoid repeated IO.

The set of points kept in memory is the intersection of the tokens posting lists, therefore it should not grow out of proportion when adding tokens to the query.
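The pattern can be sketched in a standalone way as follows. The types here are simplified stand-ins (the real index works over mmap'ed posting lists and uses AHashSet and PointOffsetType from the codebase); this is only an illustration of pre-computing the intersection once and probing it per point:

```rust
use std::collections::HashSet;

type PointOffsetType = u32;

// Simplified stand-in for the full-text index: one posting list per query token.
struct TextIndex {
    postings: Vec<Vec<PointOffsetType>>,
}

impl TextIndex {
    // Intersect the posting lists of all query tokens in a single pass.
    fn filter_query(&self) -> HashSet<PointOffsetType> {
        let mut iter = self.postings.iter();
        let mut acc: HashSet<PointOffsetType> = iter
            .next()
            .map(|p| p.iter().copied().collect())
            .unwrap_or_default();
        for posting in iter {
            let set: HashSet<PointOffsetType> = posting.iter().copied().collect();
            acc.retain(|id| set.contains(id));
        }
        acc
    }
}

fn matcher(index: TextIndex) -> impl Fn(PointOffsetType) -> bool {
    // Pre-calculate the intersection once; each subsequent point test is an
    // O(1) set probe instead of re-reading the posting lists.
    let points_for_tokens = index.filter_query();
    move |point_id| points_for_tokens.contains(&point_id)
}

fn main() {
    let index = TextIndex {
        postings: vec![vec![1, 2, 3, 7], vec![2, 3, 9]],
    };
    let is_match = matcher(index);
    assert!(is_match(2) && is_match(3));
    assert!(!is_match(7) && !is_match(9));
}
```

Because the kept set is an intersection, adding tokens to the query can only shrink it, which bounds the memory cost of the cache.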

Future work

Phrase matching does not leverage the optimization yet.
Hopefully we can follow a similar pattern there while handling positions properly.

@agourlay agourlay force-pushed the optimize-text-payload-matcher branch from c4f4006 to 16ae35a on June 27, 2025 09:09
@agourlay agourlay marked this pull request as ready for review on June 27, 2025 15:18
@agourlay agourlay requested review from coszio and timvisee on June 27, 2025 15:18


@agourlay agourlay force-pushed the optimize-text-payload-matcher branch from 8ca10ad to 3a99f45 on June 30, 2025 19:22
@agourlay agourlay requested a review from coszio on June 30, 2025 19:44
@agourlay agourlay force-pushed the optimize-text-payload-matcher branch from 3a99f45 to 4f3e5ce on July 1, 2025 08:50
@generall generall (Member) left a comment:

Benchmarks are controversial

@timvisee timvisee (Member) left a comment:

In your comment you mention that we lazily compute a list of points three times. Could you point to the exact place where we do this? That is not immediately clear to me from this PR.

Comment on lines +271 to 278
 // Build the set intersection of point ids from the query tokens
 let points_for_token = full_text_index
     .filter_query(parsed_query.clone(), &hw_counter)
     .collect::<AHashSet<_>>();

 Some(Box::new(move |point_id: PointOffsetType| {
-    full_text_index.check_match(&parsed_query, point_id, &hw_counter)
+    points_for_token.contains(&point_id)
 }))

Naive question: if we can do this here, can't we just do the same inside full_text_index.check_match(..)?

@agourlay agourlay force-pushed the optimize-text-payload-matcher branch from 1b0c81c to fced9bb on July 7, 2025 13:06

agourlay commented Jul 8, 2025

The optimization is working fine when a lot of points are being tested in the matcher.

Otherwise, the cost of preparing the intersection up front could be more expensive than the savings.
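This break-even point can be illustrated with a toy cost model (all function names and constants below are hypothetical, chosen only to show the shape of the tradeoff, not measured from Qdrant):

```rust
// Toy cost model: precomputing the intersection pays off only when many
// points are tested by the matcher.

// Lazy approach: every tested point probes every query token's posting list.
fn lazy_cost(tested_points: u64, tokens: u64, lookup_cost: u64) -> u64 {
    tested_points * tokens * lookup_cost
}

// Precomputed approach: one pass over the posting lists to build the
// intersection, then an O(1) set probe per tested point.
fn precomputed_cost(posting_len: u64, tokens: u64, lookup_cost: u64, tested_points: u64) -> u64 {
    posting_len * tokens * lookup_cost + tested_points
}

fn main() {
    // Many tested points: precomputation wins.
    assert!(precomputed_cost(10_000, 3, 1, 1_000_000) < lazy_cost(1_000_000, 3, 1));
    // Few tested points: the up-front intersection is pure overhead.
    assert!(lazy_cost(10, 3, 1) < precomputed_cost(10_000, 3, 1, 10));
}
```

With short posting lists and few tested points the lazy path wins, which is why a matcher that is rebuilt per query (or several times per query, as below) can erase the savings.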

To make matters worse, we actually build the filter_context more than once per query.
e.g.:

Creating a single filter_context would improve performance and might make the tradeoff of this PR much more of a win.

But making this change is not trivial, so I will close the current approach, as it will never pay off as long as filter_context instances are assumed to be cheap objects.

@agourlay agourlay closed this Jul 8, 2025