feat: support prefilter #247
Force-pushed from 86a804b to 9edd281
Pull Request Overview
This PR introduces support for prefiltering by renaming the configuration option from prererank_filtering to prefilter and updating related code throughout the codebase. It also optimizes query/insert operations by eliminating redundant matrix multiplications and refactors insertion logic to use new helper functions.
- Renames the configuration setting and updates its usage across modules.
- Adds a new filter method to the SearchFetcher trait and updates its implementations.
- Refactors vector insertion logic and adjusts test workflows to cover pg16 and pg17 features.
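As a rough illustration of why filtering before reranking can help, the sketch below compares the two orderings. The `CandidateFetcher` trait, `CountingFetcher`, and `search` are hypothetical names invented for this example; they are not the PR's `SearchFetcher` trait, whose real signature is not shown in this thread. The assumption is only that the filter check is cheap and the rerank step is expensive.

```rust
/// Hypothetical trait for illustration only; NOT the PR's `SearchFetcher`.
pub trait CandidateFetcher {
    /// Cheap predicate check for a candidate key (assumed: e.g. a WHERE clause).
    fn filter(&mut self, key: u64) -> bool;
    /// Expensive exact distance computation for a candidate key.
    fn rerank(&mut self, key: u64) -> f32;
}

/// Toy fetcher that counts how many times the expensive step runs.
pub struct CountingFetcher {
    pub rerank_calls: usize,
}

impl CandidateFetcher for CountingFetcher {
    fn filter(&mut self, key: u64) -> bool {
        // Assumed predicate: keep keys divisible by 3.
        key % 3 == 0
    }

    fn rerank(&mut self, key: u64) -> f32 {
        self.rerank_calls += 1;
        // Stand-in for a real distance: just the key as a float.
        key as f32
    }
}

/// With `prefilter = true`, the filter runs before rerank, so rejected
/// candidates never pay for the expensive step. With `prefilter = false`,
/// every candidate is reranked and filtered afterwards.
pub fn search<F: CandidateFetcher>(
    fetcher: &mut F,
    candidates: &[u64],
    prefilter: bool,
) -> Vec<(f32, u64)> {
    let mut out = Vec::new();
    for &key in candidates {
        if prefilter {
            if !fetcher.filter(key) {
                continue;
            }
            out.push((fetcher.rerank(key), key));
        } else {
            let distance = fetcher.rerank(key);
            if fetcher.filter(key) {
                out.push((distance, key));
            }
        }
    }
    // Order results by ascending distance.
    out.sort_by(|a, b| a.0.total_cmp(&b.0));
    out
}
```

Both orderings return the same rows in this toy model; only the number of `rerank` calls differs, which is the cost a prefilter setting is meant to reduce when the filter is selective.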
Reviewed Changes
Copilot reviewed 28 out of 28 changed files in this pull request and generated 4 comments.
File | Description |
---|---|
tests/general/*.slt | Added DROP TABLE cleanup commands in test setups |
src/index/scanners/mod.rs | Adds new filter method to SearchFetcher trait |
src/index/scanners/maxsim.rs | Refactors vector handling with clearer naming and cloning |
src/index/scanners/default.rs | Introduces rerank_wrapper and uses seq_filter for prefilter |
src/index/gucs.rs, src/index/am/mod.rs | Adjusts configuration accesses from prererank_filtering to prefilter |
src/index/algorithm.rs, crates/algorithm/* | Updates insertion logic to use new helper functions, improves prefetcher traits |
.github/workflows/check.yml | Updates test file patterns and adds version-specific test jobs |
Comments suppressed due to low confidence (2)
.github/workflows/check.yml:176
- [nitpick] Please verify that the updated test file glob pattern covers all intended test cases and that no tests are unintentionally omitted from the CI runs.
sqllogictest --db $USER --user $USER './tests/general/*.slt'
src/index/scanners/default.rs:133
- Using BinaryHeap::from(results) for filtering could incur additional overhead if the results are large; verify that this conversion meets performance targets under heavy load.
let seq = seq_filter(BinaryHeap::from(results), |key| {
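For context on the overhead question, `BinaryHeap::from(Vec<T>)` in the Rust standard library heapifies in place in O(n), and each `pop` costs O(log n). A minimal sketch of a lazily filtered heap follows; `lazy_filter` and `top_k_filtered` are hypothetical stand-ins, not the PR's `seq_filter`, and the `(distance, key)` tuple layout and the even-key predicate are assumptions made for the example.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical stand-in for a sequential filter over a heap; the real
/// `seq_filter` signature is not shown in this thread.
/// Pops candidates in ascending distance order and yields only those
/// whose key passes `keep`.
pub fn lazy_filter<F>(
    mut heap: BinaryHeap<Reverse<(u32, u64)>>,
    mut keep: F,
) -> impl Iterator<Item = (u32, u64)>
where
    F: FnMut(u64) -> bool,
{
    std::iter::from_fn(move || {
        // Keep popping until a candidate passes the filter or the heap is empty.
        while let Some(Reverse((distance, key))) = heap.pop() {
            if keep(key) {
                return Some((distance, key));
            }
        }
        None
    })
}

/// Builds the heap from a Vec (O(n) heapify) and returns the `k` nearest
/// entries that pass the filter.
pub fn top_k_filtered(results: Vec<(u32, u64)>, k: usize) -> Vec<(u32, u64)> {
    // `Reverse` turns the max-heap into a min-heap on (distance, key).
    let heap = BinaryHeap::from(results.into_iter().map(Reverse).collect::<Vec<_>>());
    // Assumed predicate for illustration: keep even keys only.
    lazy_filter(heap, |key| key % 2 == 0).take(k).collect()
}
```

Because the iterator is lazy, a consumer that stops after `k` accepted items pays for the heapify plus only as many pops as it actually needs, rather than a full sort of `results`. Whether that holds for the PR's code path would still need measuring under load, as the comment suggests.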
Force-pushed from 73484de to a6f5179
Signed-off-by: cutecutecat <junyuchen@tensorchord.ai>
Force-pushed from fa56eae to 7ee8814
src/index/scanners/default.rs
Outdated
))
}
// Force prefilter on to check whether errors still occur
// TODO: remember to change this back
Remember to change this back.
Fixed
Signed-off-by: cutecutecat <junyuchen@tensorchord.ai>
Feat
- rerank_in_table=true with read_stream (pg17) and prefetch_buffer (not pg17)
Optimization
Chore
- Rename vchordrq.prererank_filtering to vchordrq.prefilter
Test