
usamoi
Contributor

@usamoi usamoi commented Feb 20, 2025

Split the data vectors into two tapes: one is immutable after the last maintain, and the other is appendable for newly inserted vectors.

@VoVAllen
Member

what's this

@usamoi
Contributor Author

usamoi commented Feb 20, 2025

> what's this

Changes needed by multi-vector index.

@VoVAllen
Member

Please add a more detailed description to the PR.

Signed-off-by: usamoi <usamoi@outlook.com>
@usamoi usamoi merged commit d16d469 into tensorchord:main Feb 21, 2025
15 checks passed
@usamoi usamoi requested a review from Copilot March 12, 2025 03:52
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR refactors the handling of quantized data vectors by splitting them into two tapes: one immutable (frozen) and one appendable. Key changes include updates to tape writer construction and usage in multiple modules, removal of the Pipe trait, and renaming of functions and types to reflect the new split-tape design.
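
As a rough illustration of the split-tape idea described above, here is a minimal in-memory sketch. It is not the PR's code: `SplitTapes`, `insert`, `maintain`, and `scan` are hypothetical names, the tapes are plain `Vec`s rather than on-disk pages, and the assumption that a maintain pass folds the appendable tape into a fresh frozen tape is made only for this example.

```rust
/// Hypothetical stand-in for the two tapes of quantized vectors.
/// Each entry is one quantized vector, modeled here as raw bytes.
#[derive(Debug, Default)]
struct SplitTapes {
    /// Immutable between maintain passes (assumption for this sketch).
    frozen: Vec<Vec<u8>>,
    /// Receives newly inserted vectors.
    appendable: Vec<Vec<u8>>,
}

impl SplitTapes {
    /// Inserts never touch the frozen tape; they only append.
    fn insert(&mut self, code: Vec<u8>) {
        self.appendable.push(code);
    }

    /// Assumed maintain behavior: rebuild the frozen tape from both
    /// tapes and leave the appendable tape empty.
    fn maintain(&mut self) {
        let mut merged = std::mem::take(&mut self.frozen);
        merged.append(&mut self.appendable);
        self.frozen = merged;
    }

    /// A reader has to visit both tapes to see every vector.
    fn scan(&self) -> impl Iterator<Item = &Vec<u8>> + '_ {
        self.frozen.iter().chain(self.appendable.iter())
    }
}
```

In this sketch the frozen tape can be laid out once per maintain pass, while inserts only ever pay for an append; readers chain the two tapes.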

Reviewed Changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| crates/algorithm/src/build.rs | Splits the tape writer into frozen and appendable tapes; updates branch fields accordingly. |
| crates/algorithm/src/bulkdelete.rs | Refactors tuple deserialization to use new API methods. |
| crates/algorithm/src/operator.rs | Introduces a new FunctionalAccessor for improved functional composition. |
| crates/algorithm/src/lib.rs | Removes the Pipe module and adds a new Branch struct for tape management. |
| crates/algorithm/src/maintain.rs | Updates the maintain function to use the new FrozenTapeWriter. |
| crates/algorithm/src/cache.rs | Replaces deprecated Pipe usage with direct deserialization calls. |
| crates/algorithm/src/search.rs | Adjusts tape reading calls to use updated functions and error handling. |
| crates/algorithm/src/vectors.rs | Renames access functions to read_for_h1_tuple and adjusts tuple handling. |
| crates/algorithm/src/insert.rs | Refactors insert logic to use new tape append functions. |
| crates/algorithm/src/freepages.rs | Simplifies freepage handling with direct serialization/deserialization usage. |
| crates/algorithm/src/tape.rs | Refactors public API functions and internal tape writer usage to align with changes. |
| crates/algorithm/src/prewarm.rs | Updates prewarm logic to use new read_for_h1_tuple and functional access patterns. |
| crates/algorithm/src/rerank.rs | Updates reranking logic to use the revised tuple access interface. |
| crates/algorithm/src/pipe.rs | Removes the Pipe trait as part of the overall refactor. |
Comments suppressed due to low confidence (2)

crates/algorithm/src/freepages.rs:9

  • The loop condition 'while pages.is_empty()' may be incorrect because it will execute only when 'pages' is empty. If the intent is to process provided pages, consider changing the condition to 'while !pages.is_empty()'.
while pages.is_empty() {

crates/algorithm/src/build.rs:56

  • [nitpick] Verify that the new field name 'frozen_first' clearly indicates that it represents the starting pointer of the immutable tape. If necessary, consider adding comments or renaming related fields to differentiate it from the appendable tape pointer.
                    frozen_first: frozen_tape.first(),
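
The first suppressed comment turns on what a loop guarded by `is_empty()` is meant to do. The sketch below contrasts the two readings; both helpers (`drain_pages`, `refill_until_nonempty`) are hypothetical and do not come from `freepages.rs`, so it says nothing about which reading the actual code intends.

```rust
/// Hypothetical: process every provided page. The loop runs while the
/// list is NOT empty and returns how many pages were handled.
fn drain_pages(mut pages: Vec<u32>) -> usize {
    let mut processed = 0;
    while !pages.is_empty() {
        pages.pop();
        processed += 1;
    }
    processed
}

/// Hypothetical: keep fetching until at least one page is available.
/// Here `while pages.is_empty()` is the intended guard.
fn refill_until_nonempty(
    mut pages: Vec<u32>,
    mut fetch: impl FnMut() -> Vec<u32>,
) -> Vec<u32> {
    while pages.is_empty() {
        pages = fetch();
    }
    pages
}
```

Either guard can be correct; which one applies depends on whether the loop consumes the list or waits for it to be filled.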
