-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Don't encode quantization query while hnsw build #6729
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't encode quantization query while hnsw build #6729
Conversation
@@ -45,6 +45,8 @@ pub trait EncodedVectors: Sized { | |||
fn score_internal(&self, i: u32, j: u32, hw_counter: &HardwareCounterCell) -> f32; | |||
|
|||
fn quantized_vector_size(&self) -> usize; | |||
|
|||
fn encode_internal_vector(&self, id: u32) -> Option<Self::EncodedQuery>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Create query from the internal vector. Query is always in RAM
@@ -574,4 +574,8 @@ impl<TStorage: EncodedStorage> EncodedVectors for EncodedVectorsPQ<TStorage> { | |||
fn quantized_vector_size(&self) -> usize { | |||
self.metadata.vector_division.len() | |||
} | |||
|
|||
fn encode_internal_vector(&self, _id: u32) -> Option<EncodedQueryPQ> { | |||
None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We cannot create query in PQ from quantized vector without LUT accuracy loss.
hardware_counter: HardwareCounterCell, | ||
} | ||
|
||
impl<'a, TElement, TMetric, TEncodedVectors> | ||
QuantizedQueryScorer<'a, TElement, TMetric, TEncodedVectors> | ||
pub enum QuantizedInternalScorerResult<'a, TEncodedVectors> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Result of internal scorer creation. It can be constructed scorer or nothing but we want to get hardware counter ownership back
📝 WalkthroughWalkthroughThis change introduces a new method, Possibly related PRs
Suggested labels
Suggested reviewers
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (4)
🚧 Files skipped from review as they are similar to previous changes (4)
⏰ Context from checks skipped due to timeout of 90000ms (14)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
lib/segment/src/vector_storage/quantized/quantized_vectors.rs (1)
172-348
: Consider using a macro to reduce code repetition.The implementation correctly handles all storage variants with a consistent pattern. While the repetition is necessary due to Rust's type system, consider extracting the common pattern into a macro to improve maintainability:
macro_rules! handle_internal_scorer { ($quantized_data:expr, $point_id:expr, $hardware_counter:expr, $original_query:expr) => { match QuantizedQueryScorer::new_internal($point_id, $quantized_data, $hardware_counter) { QuantizedInternalScorerResult::Scorer(scorer) => raw_scorer_from_query_scorer(scorer), QuantizedInternalScorerResult::NotSupported(hw) => $original_query(hw), } }; }However, if you prefer explicit code for better readability, the current implementation is acceptable.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
lib/quantization/src/encoded_vectors.rs
(1 hunks)lib/quantization/src/encoded_vectors_binary.rs
(1 hunks)lib/quantization/src/encoded_vectors_pq.rs
(1 hunks)lib/quantization/src/encoded_vectors_u8.rs
(1 hunks)lib/segment/src/index/hnsw_index/hnsw.rs
(4 hunks)lib/segment/src/index/hnsw_index/point_scorer.rs
(1 hunks)lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs
(1 hunks)lib/segment/src/vector_storage/quantized/quantized_query_scorer.rs
(3 hunks)lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs
(1 hunks)lib/segment/src/vector_storage/quantized/quantized_vectors.rs
(2 hunks)
🔇 Additional comments (11)
lib/quantization/src/encoded_vectors.rs (1)
49-49
: Well-designed trait method addition.The
encode_internal_vector
method is properly designed with anOption
return type, allowing implementations to opt out when the optimization isn't feasible (e.g., PQ quantization). This aligns perfectly with the PR objective to reuse quantized data during HNSW construction.lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (1)
152-157
: LGTM - Clean generic parameter refactoring.Moving the generic parameters from the struct to the method call is a clean refactoring that maintains the same functionality while supporting the new internal vector encoding capabilities.
lib/quantization/src/encoded_vectors_pq.rs (1)
578-580
: Correct implementation for PQ limitations.Returning
None
is the appropriate behavior for PQ quantization, as noted in the existing comment that creating a query from quantized vector would result in LUT accuracy loss. This maintains the optimization's optional nature.lib/quantization/src/encoded_vectors_binary.rs (1)
573-581
: Solid implementation for binary quantization.The implementation correctly retrieves the encoded vector data, performs the necessary type transmutation, and creates an owned
EncodedBinVector
. The use ofto_vec()
ensures proper ownership semantics for the returned data.lib/quantization/src/encoded_vectors_u8.rs (1)
511-519
: Correct implementation for U8 quantization.The implementation properly extracts both the vector offset and encoded data using
get_vec_ptr
, creates a safe slice from the raw pointer, and constructs an ownedEncodedQueryU8
. This maintains all necessary information for accurate scoring.lib/segment/src/index/hnsw_index/hnsw.rs (1)
420-427
: Good optimization for quantized vector scoring.The refactoring to use
FilteredScorer::new_internal
avoids unnecessary vector loading and allows the scorer to work directly with quantized data when available, improving performance.lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs (1)
296-304
: Correct implementation of internal vector encoding for multivectors.The method properly handles the multivector case by encoding each internal vector and collecting the results. The early return pattern ensures consistent error handling.
lib/segment/src/index/hnsw_index/point_scorer.rs (1)
77-108
: Well-designed internal scorer constructor with lazy vector loading.The implementation efficiently handles both quantized and non-quantized cases. The closure pattern for
original_query_fn
ensures the vector is only loaded when necessary, avoiding unnecessary memory operations for quantized vectors.lib/segment/src/vector_storage/quantized/quantized_query_scorer.rs (3)
21-27
: Clean result type for internal scorer creation.The
QuantizedInternalScorerResult
enum provides a clear way to handle cases where internal vector encoding is not supported, while preserving hardware counter ownership as noted in the existing comment.
33-59
: Good refactoring: method-level generics improve flexibility.Moving the generic parameters from the struct to the method level allows for more flexible usage patterns while maintaining type safety.
61-76
: Efficient internal scorer construction.The method correctly handles the optional encoding case and properly configures the hardware counter's IO multiplier based on storage location.
@@ -507,6 +507,16 @@ impl<TStorage: EncodedStorage> EncodedVectors for EncodedVectorsU8<TStorage> { | |||
fn quantized_vector_size(&self) -> usize { | |||
self.metadata.vector_parameters.dim | |||
} | |||
|
|||
fn encode_internal_vector(&self, id: u32) -> Option<EncodedQueryU8> { | |||
let (query_offset, q_ptr) = self.get_vec_ptr(id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ToDo for another PR: avoid usage of pointers. Pretty suze zero-copy can handle it just fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changes are not exactly pretty, but I don't have any better suggestions
* dont encode quantization query while hnsw build * add comments --------- Co-authored-by: generall <andrey@vasnetsov.com>
Currently, we do quantization encoding each time we create
RawScorer
. In the case of HNSW construction, it is not required because we already have a quantized storage. We want to reuse already quantized data in HNSW construction and avoid unnecessary access to original vector data, which can be on disk.This PR fixes this behaviour and uses already constructed quantized storage in raw scorer creation.
To achieve this, this PR introduces a new method in quantization storage:
fn encode_internal_vector(&self, id: u32) -> Option<Self::EncodedQuery>;
. It takes a point id and returns an encoded query, which is used in the query scorer. It's optional because not every case can be done that way. In PQ we still want to encode the original vector because using the encoded one will produce accuracy loss in LUT. Everything is fine in the case of SQ and BQ?We don't call
score_internal
directly, instead this idea of a query getter because quantized data can be stored on disk and we want to have query vector always in RAM.Quantized vector scorer has a new construction method which returns scorer or fall bach the ownership of hardware counter. Also a new constructor has a Filtered Scorer.