Skip to content

Conversation

usamoi
Copy link
Contributor

@usamoi usamoi commented Feb 21, 2025

based on 0.2.2 release

@usamoi usamoi marked this pull request as draft February 21, 2025 06:55
@usamoi usamoi force-pushed the maxsim branch 7 times, most recently from 1299f7f to c236ff5 Compare February 24, 2025 09:11
@usamoi usamoi force-pushed the maxsim branch 4 times, most recently from 1eba71f to 779d774 Compare March 3, 2025 11:54
@usamoi usamoi marked this pull request as ready for review March 3, 2025 11:55
@usamoi usamoi force-pushed the maxsim branch 7 times, most recently from e9d0f0b to e4a97ea Compare March 12, 2025 01:06
@usamoi usamoi requested a review from Copilot March 12, 2025 03:47
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements support for the maxsim operator by modifying search and rerank logic, updating tuple serialization (including a version bump and an added field), and adjusting various modules (e.g. prewarm, insert, AM callbacks) to accommodate the new operator. Key changes include:

  • Updating functions in crates/algorithm (search, search_and_estimate, rerank) to handle maxsim semantics.
  • Bumping the tuple version and adding a new “tuples” field to JumpTuple.
  • Adjusting AM and tape operations to use the new IndexPointer and callback conventions.

Reviewed Changes

Copilot reviewed 32 out of 32 changed files in this pull request and generated no comments.

Show a summary per file
File Description
crates/algorithm/src/search.rs Modified search functions to remove rerank_in_table returns and add search_and_estimate.
crates/algorithm/src/tuples.rs Bumped VERSION to 4 and added a new field (“tuples”) to JumpTupleHeader.
crates/algorithm/src/prewarm.rs Changed the tape read invocation from using frozen_first() to appendable_first().
src/index/am/mod.rs Updated pointer conversion callbacks and opfamily extraction for improved clarity.
Cargo.toml Added dependency on always_equal module.
Comments suppressed due to low confidence (2)

crates/algorithm/src/prewarm.rs:92

  • Verify that replacing the use of jump_tuple.frozen_first() with jump_tuple.appendable_first() in the prewarm function is intentional, as it could affect which tape is read for prewarming.
jump_tuple.appendable_first(),

crates/algorithm/src/tuples.rs:11

  • The tuple version has been bumped from 3 to 4 and the JumpTupleHeader now includes a new 'tuples' field; please ensure that all downstream consumers can handle the new field and consider adding an inline comment explaining its purpose.
const VERSION: u64 = 4;

@usamoi usamoi force-pushed the maxsim branch 6 times, most recently from 3e6c030 to 4490ab5 Compare March 17, 2025 12:26
@usamoi usamoi requested a review from Copilot March 17, 2025 12:27
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements a new maxsim operator and indexing functionality while updating various parts of the codebase to support it. Key changes include:

  • Introduction of MaxsimBuilder and related maxsim logic in the index scanners.
  • Updates to configuration (GUC) and operator definitions to support new maxsim operators.
  • Modifications in CI workflows and packaging processes to accommodate new build targets.
  • Updates in algorithm modules (search, rerank, maintain, etc.) to integrate maxsim functionality.

Reviewed Changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated no comments.

Show a summary per file
File Description
.github/workflows/check.yml CI configuration updates including runner matrix and artifact packaging
src/index/scanners/maxsim.rs New maxsim operator builder implementation
crates/algorithm/src/search.rs Enhancements in search functions and introduction of search_and_estimate
src/datatype/operators_vector.rs & operators_halfvec.rs Added pg_extern functions for maxsim operator support in vector types
src/index/opclass.rs Extended operator class with maxsim variants
Other crates (algorithm, distance, etc.) Minor updates to support new maxsim features and configuration gucs
Comments suppressed due to low confidence (2)

.github/workflows/check.yml:80

  • Verify that the conditional expression using '&&' and '||' in the 'runs-on' field always evaluates to a valid runner label across all potential values of matrix.arch.
runs-on: ${{ matrix.arch == 'x86_64' && 'ubuntu-22.04' || 'ubuntu-22.04-arm' }}

crates/algorithm/src/rerank.rs:71

  • [nitpick] Consider moving the extraction of vector metadata outside the loop to avoid redundant computation on each iteration, which could improve performance.
let vector = O::Vector::elements_and_metadata(vector.as_borrowed());

@usamoi usamoi force-pushed the maxsim branch 4 times, most recently from e096728 to fcf95f7 Compare March 18, 2025 07:58
usamoi added 4 commits March 18, 2025 16:13
Signed-off-by: usamoi <usamoi@outlook.com>
Signed-off-by: usamoi <usamoi@outlook.com>
Signed-off-by: usamoi <usamoi@outlook.com>
Signed-off-by: usamoi <usamoi@outlook.com>
Signed-off-by: usamoi <usamoi@outlook.com>
@usamoi usamoi merged commit de66694 into tensorchord:main Mar 18, 2025
15 checks passed
@kemingy kemingy mentioned this pull request Mar 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant