-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Fix vector_io_read for multivector recommendation #6622
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughThe changes introduce new multi-vector quantized scorer modules and types, including Possibly related PRs
Suggested reviewers
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Updated new results measurements to 10k experiments for a fair comparison. |
lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs
Outdated
Show resolved
Hide resolved
lib/segment/src/vector_storage/query_scorer/sparse_custom_query_scorer.rs
Show resolved
Hide resolved
ec7c891
to
07b1097
Compare
I would wait for the version without disposable, if it is possible |
I do have a version without disposable locally which works fine for the quantized multivectors case. Investigating right now 🔍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
lib/quantization/src/encoded_vectors.rs
(1 hunks)lib/quantization/src/encoded_vectors_binary.rs
(1 hunks)lib/quantization/src/encoded_vectors_pq.rs
(1 hunks)lib/quantization/src/encoded_vectors_u8.rs
(1 hunks)lib/segment/src/vector_storage/quantized/mod.rs
(1 hunks)lib/segment/src/vector_storage/quantized/quantized_custom_query_scorer.rs
(2 hunks)lib/segment/src/vector_storage/quantized/quantized_multi_custom_query_scorer.rs
(1 hunks)lib/segment/src/vector_storage/quantized/quantized_multi_query_scorer.rs
(1 hunks)lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs
(2 hunks)lib/segment/src/vector_storage/quantized/quantized_query_scorer.rs
(2 hunks)lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs
(7 hunks)
✅ Files skipped from review due to trivial changes (1)
- lib/segment/src/vector_storage/quantized/mod.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs
🧰 Additional context used
🧬 Code Graph Analysis (3)
lib/quantization/src/encoded_vectors.rs (4)
lib/quantization/src/encoded_vectors_pq.rs (1)
quantized_vector_size
(572-574)lib/quantization/src/encoded_vectors_u8.rs (1)
quantized_vector_size
(505-507)lib/quantization/src/encoded_vectors_binary.rs (1)
quantized_vector_size
(359-361)lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs (1)
quantized_vector_size
(300-302)
lib/quantization/src/encoded_vectors_pq.rs (4)
lib/quantization/src/encoded_vectors.rs (1)
quantized_vector_size
(40-40)lib/quantization/src/encoded_vectors_u8.rs (1)
quantized_vector_size
(505-507)lib/quantization/src/encoded_vectors_binary.rs (1)
quantized_vector_size
(359-361)lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs (1)
quantized_vector_size
(300-302)
lib/quantization/src/encoded_vectors_u8.rs (4)
lib/quantization/src/encoded_vectors.rs (1)
quantized_vector_size
(40-40)lib/quantization/src/encoded_vectors_pq.rs (1)
quantized_vector_size
(572-574)lib/quantization/src/encoded_vectors_binary.rs (1)
quantized_vector_size
(359-361)lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs (1)
quantized_vector_size
(300-302)
⏰ Context from checks skipped due to timeout of 90000ms (13)
- GitHub Check: storage-compat-test
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: test-low-resources
- GitHub Check: integration-tests-consensus
- GitHub Check: Basic TLS/HTTPS tests
- GitHub Check: test-consistency
- GitHub Check: test-consensus-compose
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: integration-tests
- GitHub Check: lint
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: rust-tests (ubuntu-latest)
🔇 Additional comments (14)
lib/quantization/src/encoded_vectors.rs (1)
40-40
: LGTM! Clean trait method addition.The new
quantized_vector_size
method provides a standardized interface for accessing quantized vector sizes across all encoded vector implementations, supporting the hardware counter refactoring.lib/quantization/src/encoded_vectors_pq.rs (1)
572-574
: LGTM! Correct PQ quantized vector size implementation.The implementation correctly returns the length of vector division, which represents the number of quantized components in Product Quantization. This aligns with the PQ algorithm where vectors are divided into chunks and each chunk is quantized to a centroid index.
lib/quantization/src/encoded_vectors_u8.rs (1)
505-507
: LGTM! Correct U8 quantized vector size implementation.The implementation correctly returns the vector dimension since in U8 scalar quantization, each dimension is quantized to a single byte, making the quantized vector size equal to the original dimension.
lib/quantization/src/encoded_vectors_binary.rs (1)
359-361
: LGTM! Good delegation to existing logic.The implementation correctly delegates to the existing
get_quantized_vector_size()
method, avoiding code duplication and reusing the established binary quantization size calculation logic.lib/segment/src/vector_storage/quantized/quantized_query_scorer.rs (2)
67-69
: LGTM! Hardware counter increment properly tracks IO reads.The addition of the hardware counter increment before scoring ensures accurate tracking of vector IO reads. This aligns with the PR objective to fix the measurement issue where cached reads weren't being properly accounted for.
8-8
: Good simplification to single-vector focus.The import changes appropriately reflect the removal of multi-vector support, making this scorer dedicated to single-vector operations only.
lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (3)
16-17
: Correct imports for new multi-vector scorer types.The addition of imports for
QuantizedMultiCustomQueryScorer
andQuantizedMultiQueryScorer
properly supports the refactoring to dedicated multi-vector scorers.
206-206
: Good parameter rename for clarity.Renaming the parameter from
quantized_storage
toquantized_multivector_storage
better reflects its specific purpose in multi-vector scoring contexts.
224-230
:✅ Verification successful
Verify the new multi-vector scorer constructor signatures.
The update to use
QuantizedMultiQueryScorer
withnew_multi
constructor looks correct. Ensure the new scorer types are properly implemented with the expected constructor signatures.
🏁 Script executed:
#!/bin/bash # Description: Verify the new multi-vector scorer types have the expected constructors # Check for QuantizedMultiQueryScorer implementation ast-grep --pattern 'impl $_ QuantizedMultiQueryScorer $_ { $$$ pub fn new_multi($_) $_ { $$$ } $$$ }' # Check for QuantizedMultiCustomQueryScorer implementation ast-grep --pattern 'impl $_ QuantizedMultiCustomQueryScorer $_ { $$$ pub fn new_multi($_) $_ { $$$ } $$$ }'Length of output: 237
🏁 Script executed:
#!/bin/bash # Search for the scorer types and their new_multi constructors set +e echo "Searching for QuantizedMultiQueryScorer definitions and usages..." rg -n "QuantizedMultiQueryScorer" -A5 -B5 echo "---" echo "Searching for QuantizedMultiCustomQueryScorer definitions and usages..." rg -n "QuantizedMultiCustomQueryScorer" -A5 -B5 echo "---" echo "Searching for all `pub fn new_multi` implementations..." rg -n "pub fn new_multi" -A5 -B5Length of output: 21301
Constructors Verified:
new_multi
Signatures Match UsageBoth
QuantizedMultiQueryScorer::new_multi
andQuantizedMultiCustomQueryScorer::new_multi
have the correct parameter lists and return types as used inquantized_scorer_builder.rs
. No changes are needed.–
lib/segment/src/vector_storage/quantized/quantized_multi_query_scorer.rs
defines
pub fn new_multi(&MultiDenseVectorInternal, &TEncodedVectors, &QuantizationConfig, HardwareCounterCell) -> Self
–
lib/segment/src/vector_storage/quantized/quantized_multi_custom_query_scorer.rs
defines
pub fn new_multi<TOriginalQuery, TInputQuery>(TInputQuery, &TEncodedVectors, &QuantizationConfig, HardwareCounterCell) -> Self
Both match their invocations in
quantized_scorer_builder.rs
.lib/segment/src/vector_storage/quantized/quantized_custom_query_scorer.rs (2)
95-97
: Consistent hardware counter tracking implementation.The hardware counter increment before scoring matches the pattern in
quantized_query_scorer.rs
, ensuring consistent IO measurement across all quantized scorers.
8-8
: Clean removal of multi-vector dependencies.The import simplification appropriately reflects the focus on single-vector operations, consistent with the overall refactoring strategy.
lib/segment/src/vector_storage/quantized/quantized_multi_custom_query_scorer.rs (1)
112-118
: Good implementation of hardware counter delegation.The delegation of scoring to the quantized multivector storage, along with passing the hardware counter, aligns well with the PR objective of moving IO measurement to the multivector storage level for accurate tracking.
lib/segment/src/vector_storage/quantized/quantized_multi_query_scorer.rs (2)
73-77
: Well-structured delegation pattern.The implementation correctly delegates scoring operations to the quantized multivector storage while passing the hardware counter. This ensures accurate IO measurement at the storage level as intended by the PR.
83-86
: Appropriate implementation of internal scoring.Unlike the custom query scorer, this implementation correctly delegates internal scoring to the storage layer, which is the expected behavior for standard multi-query scenarios.
lib/segment/src/vector_storage/quantized/quantized_multi_custom_query_scorer.rs
Show resolved
Hide resolved
The latest commit 81374af is implementing the fix without the use of a disposable counter. The increment of IO counters is removed from all non multivector quantized storage.
The quantized storage are now responsible for exposing the quantized vector size However, part of the fix is having the multivector quantized storage reports the IO counter as well. To avoid double IO counting at the query scorer level, a new type of specialized quantized multivector query scorer is introduced where no measurement is performed.
This was tested manually. |
@@ -201,7 +203,7 @@ impl<'a> QuantizedScorerBuilder<'a> { | |||
|
|||
fn new_multi_quantized_scorer<TElement, TMetric, TEncodedQuery>( | |||
self, | |||
quantized_storage: &'a impl EncodedVectors<TEncodedQuery>, | |||
quantized_multivector_storage: &'a impl EncodedVectors<TEncodedQuery>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed for clarity
* Fix vector_io_read for multivector recommendation * fix * cleanup * single disposable * do not use disposable hardware counter for quantized multivectors
This PR fixes the
vector_io_read
hardware counter when using a recommendation query with several examples against a quantized multivector storage.The root of the issue is the multivector storage wrapping a quantized storage where the hardware measurement is taking place.
However, the quadratic scoring happening for multivectors is triggering an excess of IO measurement in the quantized storage.
Long story short, the same vectors are fetched several times in a row but won't show up in the kernel IO measurement because of caching.
The proposed fix is to perform the hardware measurement within the multivector storage instead.
Example
A recommend
average_score
query with 5 positives examples which are getting merged..Each example is a multivector with 4 vectors forming an internal query of 20 vectors.
Each sub-vector from the query will be executed against each sub-vector of the target point (80 times in total).
It means each sub-vector of the target point is fetched 20 times.
I have seen this effect in action while scoring 5080 points from this example which used to be reported as 52MB
vector_io_read
.It is now back to the expected value
5080 multivec * 4 sub-vec * 128 dim * 1 bytes (unint8) = 2,6 MB
.Pricer model
The issue has been originally reported by Pricer.
Below the results for 10k experiments before and after the fix where the outliers on the left hand disappear.