@cutecutecat cutecutecat commented May 12, 2025

Feat

  • Enable rerank_in_table=true, using read_stream on pg17 and prefetch_buffer on other versions

Optimization

  • Remove a redundant matrix multiplication in both the heap and non-heap query/insert paths, which accelerates most searches

Chore

  • Rename vchordrq.prererank_filtering to vchordrq.prefilter

Test

  • Add specialized tests for pg16(prefetch_buffer) and pg17(read_stream)

@cutecutecat cutecutecat force-pushed the prefilter branch 4 times, most recently from 86a804b to 9edd281 Compare May 12, 2025 07:07
@cutecutecat cutecutecat requested a review from Copilot May 12, 2025 07:27
@Copilot Copilot AI left a comment

Pull Request Overview

This PR introduces support for prefiltering by renaming the configuration option from prererank_filtering to prefilter and updating related code throughout the codebase. It also optimizes query/insert operations by eliminating redundant matrix multiplications and refactors insertion logic to use new helper functions.

  • Renames the configuration setting and updates its usage across modules.
  • Adds a new filter method to the SearchFetcher trait and updates its implementations.
  • Refactors vector insertion logic and adjusts test workflows to cover pg16 and pg17 features.

Reviewed Changes

Copilot reviewed 28 out of 28 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/general/*.slt | Added DROP TABLE cleanup commands in test setups |
| src/index/scanners/mod.rs | Adds new filter method to SearchFetcher trait |
| src/index/scanners/maxsim.rs | Refactors vector handling with clearer naming and cloning |
| src/index/scanners/default.rs | Introduces rerank_wrapper and uses seq_filter for prefilter |
| src/index/gucs.rs, src/index/am/mod.rs | Adjusts configuration accesses from prererank_filtering to prefilter |
| src/index/algorithm.rs, crates/algorithm/* | Updates insertion logic to use new helper functions, improves prefetcher traits |
| .github/workflows/check.yml | Updates test file patterns and adds version-specific test jobs |

Comments suppressed due to low confidence (2)

.github/workflows/check.yml:176

  • [nitpick] Please verify that the updated test file glob pattern covers all intended test cases and that no tests are unintentionally omitted from the CI runs.
sqllogictest --db $USER --user $USER './tests/general/*.slt'

src/index/scanners/default.rs:133

  • Using BinaryHeap::from(results) for filtering could incur additional overhead if the results are large; verify that this conversion meets performance targets under heavy load.
let seq = seq_filter(BinaryHeap::from(results), |key| {
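
For context on the heap-conversion concern above: in the Rust standard library, `BinaryHeap::from(Vec<T>)` heapifies in place in O(n), and each later `pop` costs O(log n). The sketch below is only an illustration of that pattern under stated assumptions. `filtered_in_order` and `top_k_filtered` are hypothetical names, the `(distance, key)` tuple layout is assumed, and none of this is the PR's actual `seq_filter` implementation.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical helper (not the PR's `seq_filter`): lazily pops candidates
/// from a min-heap in ascending distance order and yields only those whose
/// key is accepted by `keep`.
fn filtered_in_order<F>(
    mut heap: BinaryHeap<Reverse<(u32, u64)>>,
    mut keep: F,
) -> impl Iterator<Item = (u32, u64)>
where
    F: FnMut(u64) -> bool,
{
    std::iter::from_fn(move || {
        // Pop until a candidate passes the predicate or the heap is empty.
        while let Some(Reverse((dist, key))) = heap.pop() {
            if keep(key) {
                return Some((dist, key));
            }
        }
        None
    })
}

/// Builds the heap from an unsorted Vec of (distance, key) pairs and returns
/// at most `k` accepted candidates, nearest first.
fn top_k_filtered<F>(results: Vec<(u32, u64)>, k: usize, keep: F) -> Vec<(u32, u64)>
where
    F: FnMut(u64) -> bool,
{
    // `Reverse` turns the std max-heap into a min-heap on distance.
    let wrapped: Vec<Reverse<(u32, u64)>> = results.into_iter().map(Reverse).collect();
    // O(n) in-place heapify; no full sort is performed.
    let heap = BinaryHeap::from(wrapped);
    filtered_in_order(heap, keep).take(k).collect()
}
```

Because the iterator is lazy, only as many elements are popped as are needed to produce `k` accepted candidates, so the cost after the O(n) conversion depends on how selective the predicate is rather than on the full result size.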

@cutecutecat cutecutecat marked this pull request as ready for review May 12, 2025 07:57
@cutecutecat cutecutecat requested a review from usamoi May 12, 2025 07:59
@cutecutecat cutecutecat force-pushed the prefilter branch 16 times, most recently from 73484de to a6f5179 Compare May 13, 2025 01:00
@cutecutecat cutecutecat requested a review from usamoi May 13, 2025 01:15
Signed-off-by: cutecutecat <junyuchen@tensorchord.ai>
@cutecutecat cutecutecat force-pushed the prefilter branch 9 times, most recently from fa56eae to 7ee8814 Compare May 13, 2025 14:48
@cutecutecat cutecutecat requested a review from usamoi May 14, 2025 01:33
))
}
// Force prefilter on to check whether it still fails
// TODO: remember to change this back
Contributor

Remember to change this back.

@cutecutecat cutecutecat May 14, 2025

Fixed

Signed-off-by: cutecutecat <junyuchen@tensorchord.ai>
@cutecutecat cutecutecat merged commit b7c6a7f into tensorchord:main May 14, 2025
15 checks passed
@cutecutecat cutecutecat deleted the prefilter branch May 14, 2025 06:49