Conversation

xzfc
Contributor

@xzfc xzfc commented Mar 24, 2025

Current version

Currently, we have the following traits/structs:

  • RawScorer filters points by deletion flags, then scores them.
  • FilteredScorer filters points by FilterContext, then passes them to RawScorer for scoring.
    Some methods of FilteredScorer simply call the corresponding methods of RawScorer.

Both RawScorer and FilteredScorer filter points, and both of them perform the scoring.
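A minimal runnable sketch of this pre-refactor split (hypothetical simplified types: a bool vector stands in for the deletion bitslices, a closure for FilterContext, and the scoring is a dummy) makes the double filtering visible:

```rust
type PointOffsetType = u32;
type ScoreType = f32;

struct RawScorer {
    point_deleted: Vec<bool>, // stand-in for the deletion bitslices
}

impl RawScorer {
    // Filters by deletion flags, then scores (dummy scoring here).
    fn score_points(&self, ids: &[PointOffsetType]) -> Vec<(PointOffsetType, ScoreType)> {
        ids.iter()
            .copied()
            .filter(|&id| !self.point_deleted[id as usize])
            .map(|id| (id, id as ScoreType * 0.5))
            .collect()
    }
}

struct FilteredScorer<'a> {
    raw: &'a RawScorer,
    filter: Option<&'a dyn Fn(PointOffsetType) -> bool>, // stand-in for FilterContext
}

impl FilteredScorer<'_> {
    // Filters by FilterContext, then delegates; RawScorer filters *again*.
    fn score_points(&self, ids: &[PointOffsetType]) -> Vec<(PointOffsetType, ScoreType)> {
        let passed: Vec<PointOffsetType> = ids
            .iter()
            .copied()
            .filter(|&id| self.filter.map_or(true, |f| f(id)))
            .collect();
        self.raw.score_points(&passed)
    }
}

fn demo() -> Vec<(PointOffsetType, ScoreType)> {
    let raw = RawScorer {
        point_deleted: vec![false, true, false, false],
    };
    let not_three = |id: PointOffsetType| id != 3;
    let scorer = FilteredScorer {
        raw: &raw,
        filter: Some(&not_three),
    };
    // 1 is dropped by RawScorer (deleted), 3 by FilteredScorer (filter context).
    scorer.score_points(&[0, 1, 2, 3])
}
```

Both layers walk the candidate list, which is exactly the duplication this PR removes.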

In this PR

In this PR, I moved a lot of logic from RawScorer to FilteredScorer.

  • RawScorer::score_points() doesn't filter points anymore.
  • point_deleted, vec_deleted bitslices are moved to FilteredScorer.
    Consequence: FilteredScorer::check_vector() should now be a bit faster.
    • It no longer performs a dyn trait call to RawScorer.
    • Now these bitslices are checked before checking FilterContext, not after.
  • peek_top_iter() and peek_top_all() methods are moved to FilteredScorer.
  • Many tests that used RawScorer have been updated to use FilteredScorer instead.
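The reordered check can be sketched like this (a hypothetical simplification: plain bool slices stand in for the real bitslices, and a closure for FilterContext; the actual code uses BitSlice and dyn dispatch):

```rust
type PointOffsetType = u32;

struct FilteredScorer<'a> {
    point_deleted: &'a [bool], // stand-ins for the real bitslices
    vec_deleted: &'a [bool],
    filter: Option<&'a dyn Fn(PointOffsetType) -> bool>, // stand-in for FilterContext
}

impl FilteredScorer<'_> {
    fn check_vector(&self, id: PointOffsetType) -> bool {
        let i = id as usize;
        // Cheap bitslice lookups run first; the (potentially expensive)
        // FilterContext call only happens for points that survive them.
        !self.point_deleted[i] && !self.vec_deleted[i] && self.filter.map_or(true, |f| f(id))
    }
}

fn demo() -> Vec<PointOffsetType> {
    let point_deleted = [false, true, false, false];
    let vec_deleted = [false, false, true, false];
    let odd_only = |id: PointOffsetType| id % 2 == 1;
    let scorer = FilteredScorer {
        point_deleted: &point_deleted,
        vec_deleted: &vec_deleted,
        filter: Some(&odd_only),
    };
    // 1 fails point_deleted, 2 fails vec_deleted, 0 fails the filter context.
    (0u32..4).filter(|&id| scorer.check_vector(id)).collect()
}
```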

Removed temporary buffer FilteredScorer::points_buffer

Not in this version of the PR.

What was in this PR previously? A minor change: the interface of the scorer and the filterer was adjusted to avoid the need for a temporary buffer.

Previously:

  1. Caller prepares a vector of PointOffsetType and passes it to FilteredScorer.
  2. FilteredScorer
    • filters this vector in place (destroying the original values)
    • allocates a buffer of type Vec<ScoredPointOffset> (reusable, stored in FilteredScorer)
    • calls RawScorer::score_points with both of these vectors.
  3. RawScorer::score_points also performs its own filtering. It returns the number of scored points.
  4. FilteredScorer returns the buffer slice, truncated to this number, to the caller.

In this PR:

  1. Caller prepares a vector of ScoredPointOffset.
    Only the .idx field needs to be set; the .score field can hold an arbitrary placeholder value.
  2. PointsFilterer filters this vector in place using Vec::retain().
  3. RawScorer::score_points fills in the .score values in place.
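This abandoned flow can be sketched as follows (hypothetical simplified types; the scoring function is a dummy):

```rust
type PointOffsetType = u32;
type ScoreType = f32;

#[derive(Debug, PartialEq, Clone, Copy)]
struct ScoredPointOffset {
    idx: PointOffsetType,
    score: ScoreType,
}

// 1. The caller prepares the vector; `.score` starts as a placeholder.
// 2. The filterer shrinks it in place with Vec::retain.
// 3. The scorer fills `.score` in place (dummy scoring function here).
fn search(candidates: &[PointOffsetType], deleted: &[bool]) -> Vec<ScoredPointOffset> {
    let mut points: Vec<ScoredPointOffset> = candidates
        .iter()
        .map(|&idx| ScoredPointOffset { idx, score: 0.0 })
        .collect();
    points.retain(|p| !deleted[p.idx as usize]);
    for p in points.iter_mut() {
        p.score = p.idx as ScoreType * 10.0;
    }
    points
}
```

Only one allocation lives through the whole pipeline, which is what made the temporary buffer unnecessary.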

Why is it no longer in this PR?

In the current version of the PR, RawScorerImpl::score_points calls QueryScorer::score_stored_batch, which accepts two separate slices.

Doing the same for QueryScorer would be complicated (though probably possible).
Illustrated: if you replace ids: &[PointOffsetType], scores: &mut [ScoreType] with ids_with_scores: &mut [ScoredPointOffset], then you can't just pass ids to get_dense_batch in the following code:

let mut vectors = [MaybeUninit::uninit(); VECTOR_READ_BATCH_SIZE];
let vectors = self
    .vector_storage
    .get_dense_batch(ids, &mut vectors[..ids.len()]);
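The friction is that a batch API wanting a plain ids slice cannot borrow one out of an interleaved layout; a hypothetical illustration:

```rust
type PointOffsetType = u32;
type ScoreType = f32;

#[allow(dead_code)]
struct ScoredPointOffset {
    idx: PointOffsetType,
    score: ScoreType,
}

// A batch API that wants a plain `&[PointOffsetType]` cannot borrow one out of
// an interleaved `&[ScoredPointOffset]`: the ids have to be copied into a
// scratch buffer first. The separate ids/scores slice layout avoids this copy.
fn ids_for_batch(ids_with_scores: &[ScoredPointOffset]) -> Vec<PointOffsetType> {
    ids_with_scores.iter().map(|p| p.idx).collect()
}
```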

Performance

BFB on small (32 element) vectors: 3612 RPS -> 4721 RPS.
No significant changes on 1.5k vectors.

https://github.com/qdrant/vector-db-benchmark/actions/runs/14606832003

Criterion benchmark results

A lot of benchmarks got a huge boost, but the benchmark logic is not exactly the same, so take the numbers with a grain of salt:

  • Benchmarks that use TestRawScorerProducer no longer use FakeFilterContext (so one less dyn call in FilteredScorer for them).
  • FilteredScorer::score_points() now returns an iterator, not a slice.
Results

hnsw_build_graph:

Benchmarking hnsw-index-build-group/hnsw_index: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 9.2s.
hnsw-index-build-group/hnsw_index
                        time:   [915.87 ms 920.81 ms 926.33 ms]
                        change: [-3.5310% -2.8506% -2.0601%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 10 measurements (20.00%)
  2 (20.00%) high mild

hnsw_build_asymptotic:

hnsw-index-build-asymptotic/build-n-search-hnsw-5k
                        time:   [43.190 µs 43.231 µs 43.275 µs]
                        change: [-5.8648% -5.5461% -5.1992%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
hnsw-index-build-asymptotic/build-n-search-hnsw-1M
                        time:   [143.51 µs 143.76 µs 144.02 µs]
                        change: [-10.849% -10.650% -10.448%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
hnsw-index-build-asymptotic/build-n-search-hnsw-1M-score-point
                        time:   [68.044 µs 68.103 µs 68.162 µs]
                        change: [-18.860% -18.676% -18.505%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

scoring-vector/score-point
                        time:   [19.259 µs 19.275 µs 19.291 µs]
                        change: [-23.060% -22.902% -22.749%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
scoring-vector/score-point-10x
                        time:   [19.414 µs 19.448 µs 19.485 µs]
                        change: [-25.670% -25.517% -25.364%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  6 (6.00%) high mild
scoring-vector/score-point-50x
                        time:   [55.111 µs 55.168 µs 55.227 µs]
                        change: [-23.943% -23.653% -23.373%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

scoring-vector/basic-score-point
                        time:   [144.57 µs 144.70 µs 144.82 µs]
                        change: [-0.8233% -0.6519% -0.4308%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) low mild
  2 (2.00%) high mild
  1 (1.00%) high severe
scoring-vector/basic-score-point-10x
                        time:   [155.73 µs 155.89 µs 156.04 µs]
                        change: [-1.2111% -0.9589% -0.7167%] (p = 0.00 < 0.05)
                        Change within noise threshold.

hnsw_search_graph:

hnsw-search-graph/uncompressed
                        time:   [357.41 µs 358.14 µs 358.86 µs]
                        change: [-17.125% -16.858% -16.586%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) low mild
hnsw-search-graph/compressed
                        time:   [360.08 µs 360.91 µs 361.75 µs]
                        change: [-16.297% -16.001% -15.667%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
hnsw-search-graph/plain time:   [19.076 ms 19.113 ms 19.156 ms]
                        change: [-5.7350% -5.5325% -5.3217%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

Performance on glove-100

Made 30k unique batch search requests with 10 searches in each. Times for each batch are in ms.

No filter:

         before   after
min      10.017    8.731
max      20.559   22.528
mean     12.683   11.020
median   12.380   10.792
p95      15.680   13.445
p99      17.600   15.077

With no-op filter:

         before   after
min      10.724    9.445
max      20.958   22.160
mean     13.595   11.986
median   13.267   11.700
p95      16.967   14.755
p99      19.192   16.051

Request for no filter:

POST collections/{collection}/points/search/batch
{
  "searches": [
    {"vector": <vector>, "limit": 10 },
    … (9 more) …
  ]
}

Request for no-op filter:

POST collections/{collection}/points/search/batch
{
  "searches": [
    {"vector": <vector>, "limit": 10, "filter": {"must_not": {"has_id": [1]}} },
    … (9 more) …
  ]
}

Why

The new interface would be useful for the incremental HNSW build, where we would need to search without filtering.

Filter order

Another potential improvement that this PR would allow is described below.

Suppose we perform an unfiltered search within a graph that has additional links. See this code: https://github.com/qdrant/qdrant/blob/7abc684361fb81d8b62cf1554d8bf4fb65a706d7/lib/segment/src/index/hnsw_index/graph_layers.rs#L73-L80

The logic here is NOT "use the first `m` neighbors, aka main links". Instead, the implemented logic is "use the first `m` non-visited neighbors, i.e. take additional links if the main ones have already been visited". It seems to me that the implemented logic imposes unnecessary additional work. I believe that we should flip the filters like this (pseudocode):
// old
point_ids.filter(is_not_visited).filter(is_not_deleted).take(m)
// new
point_ids.filter(is_not_deleted).take(m).filter(is_not_visited)
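The same pseudocode made runnable (bool slices stand in for the visited set and deletion flags) shows that the two orders can select different links:

```rust
fn old_order(ids: &[u32], visited: &[bool], deleted: &[bool], m: usize) -> Vec<u32> {
    // Scans past visited links to fill all `m` slots, pulling in extra links.
    ids.iter()
        .copied()
        .filter(|&id| !visited[id as usize])
        .filter(|&id| !deleted[id as usize])
        .take(m)
        .collect()
}

fn new_order(ids: &[u32], visited: &[bool], deleted: &[bool], m: usize) -> Vec<u32> {
    // Commits to the first `m` non-deleted (main) links, then drops visited ones.
    ids.iter()
        .copied()
        .filter(|&id| !deleted[id as usize])
        .take(m)
        .filter(|&id| !visited[id as usize])
        .collect()
}
```

With neighbors [0, 1, 2, 3], point 0 already visited, nothing deleted, and m = 2, the old order returns [1, 2] (it reached past the first m links), while the new order returns [1] (it stays within the first m non-deleted links).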

@xzfc xzfc requested review from timvisee and generall March 24, 2025 19:53
@xzfc xzfc force-pushed the separate-scorer-filterer branch from 9bf697e to 0586a80 Compare March 24, 2025 19:55
Contributor

coderabbitai bot commented Mar 24, 2025

📝 Walkthrough

This change set performs a comprehensive refactor of scoring and filtering logic throughout the vector search segment, focusing on the consolidation and simplification of scorer construction and usage. The core change is the redesign of the FilteredScorer abstraction: it now owns its raw scorer, manages both point and vector deletion bit slices, and encapsulates filter context handling via a new BoxCow enum for flexible ownership. The FilteredScorer constructor is expanded to handle quantized and non-quantized vectors, filter contexts, and deletion masks in a unified manner, replacing scattered logic and explicit filter context wrappers.

All usages of FakeFilterContext and manual wrapping of raw scorers with filter contexts are removed across benchmarks, tests, and production code. Scorer acquisition is streamlined—benchmarks, tests, and index implementations now directly obtain a scorer via a single method or constructor. The RawScorer trait and its implementations are simplified, with the removal of deleted-point filtering logic and the refactoring of scoring methods to operate on pre-filtered points. Related changes propagate to quantized scorer builders, vector storage, and index modules, ensuring consistent scorer usage and interface signatures.
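The BoxCow enum mentioned above might look roughly like this (a guess at its shape for illustration, not the actual qdrant type):

```rust
// Either borrows a caller-owned value or owns a boxed one, and derefs to the
// same target either way — useful when a scorer sometimes receives a borrowed
// filter context and sometimes has to own one.
enum BoxCow<'a, T: ?Sized> {
    Borrowed(&'a T),
    Boxed(Box<T>),
}

impl<T: ?Sized> std::ops::Deref for BoxCow<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        match self {
            BoxCow::Borrowed(t) => t,
            BoxCow::Boxed(b) => b,
        }
    }
}

fn demo() -> (String, String) {
    let borrowed: BoxCow<'_, str> = BoxCow::Borrowed("filter");
    let boxed: BoxCow<'static, str> = BoxCow::Boxed(String::from("filter").into_boxed_str());
    // Both variants deref to the same target type.
    (borrowed.to_string(), boxed.to_string())
}
```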

Suggested reviewers

  • timvisee
  • generall

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e2bce5a and 45a9cc3.

📒 Files selected for processing (1)
  • lib/segment/src/vector_storage/async_raw_scorer.rs (9 hunks)
🔇 Additional comments (6)
lib/segment/src/vector_storage/async_raw_scorer.rs (6)

8-8: Importing error handling types

Good addition of OperationError and OperationResult imports, which aligns with proper error handling in the refactored code.


24-24: Simplified function call

The constructor call has been simplified by removing deleted points and vectors parameters, which matches the PR objective of moving filtering logic to FilteredScorer.


36-41: Simplified AsyncRawScorerImpl constructor

The constructor has been refactored to only require query_scorer and storage, without deleted points bitslices. This aligns with the PR objective of moving filtering responsibility to FilteredScorer.


88-88: Simplified distance retrieval

Good simplification of getting the distance from storage.


121-121: Improved code reuse with helper function

Good refactoring to use the new helper function async_raw_scorer_from_query_scorer across all query vector types, which reduces code duplication and improves maintainability.

Also applies to: 140-140, 149-149, 159-159, 168-168


174-185: Well-designed helper function

The new helper function centralizes the creation of AsyncRawScorerImpl instances, making the code more maintainable and following the DRY principle. The function has appropriate type constraints and lifetime annotations.


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (6)
lib/segment/src/index/hnsw_index/tests/mod.rs (1)

41-44: Refactored to separate scoring and filtering

The code now retrieves both a raw scorer and a filterer using get_scorer_and_filterer, replacing the previous approach with FakeFilterContext and FilteredScorer. This change is consistent with the PR's objective of decoupling scoring and filtering mechanisms.

The clone() on added_vector might be unnecessary since the vector is already passed by value. Consider reviewing if this clone operation is required:

-        let (raw_scorer, filterer) = vector_holder
-            .get_scorer_and_filterer(added_vector.clone())
-            .unwrap();
+        let (raw_scorer, filterer) = vector_holder
+            .get_scorer_and_filterer(added_vector)
+            .unwrap();
lib/segment/src/vector_storage/dense/memmap_dense_vector_storage.rs (1)

677-681: Improved result vector preparation with new_unset constructor

The implementation now uses a more direct approach to prepare the result vector by mapping over query points and using the new ScoredPointOffset::new_unset() constructor. This eliminates the need for vector resizing and makes the code more readable and efficient.

Consider adding a comment explaining that using new_unset() initializes the score with a placeholder value (0.0) that will be updated by score_points() to help future developers understand the flow.

  let mut res = query_points
      .iter()
      .map(|&idx| ScoredPointOffset::new_unset(idx))
      .collect::<Vec<_>>();
+ // The scores are initialized with placeholder values and will be updated by score_points
  scorer.score_points(&mut res);
lib/segment/src/vector_storage/async_raw_scorer.rs (1)

73-88: Refined score_points API
Now score_points directly updates the scores array in place, simplifying the interface. The early return on is_stopped is good for responsiveness. Using Cell for in-place mutation in the closure is well-implemented.

Consider providing a fallback strategy if io_uring is not supported, e.g. logging a warning or using the synchronous path, to avoid unexpected panic in production.

lib/segment/src/index/hnsw_index/point_filterer.rs (1)

58-69: Fix spelling in documentation.

The words performace and indended are misspelled in these lines.

Please apply this diff to correct them:

-/// This method operates on [`ScoredPointOffset`] for performace reasons.
-/// The indended usage is to prepare a vector of [`ScoredPointOffset`]
+/// This method operates on [`ScoredPointOffset`] for performance reasons.
+/// The intended usage is to prepare a vector of [`ScoredPointOffset`]

lib/segment/benches/hnsw_build_asymptotic.rs (1)

35-35: Handle potential panic on .unwrap().

Each of these lines calls .unwrap(), which will panic if get_scorer_and_filterer returns an error. While it may be acceptable in benchmarking code, using .expect("<message>") or an error-handling strategy can provide clearer diagnostics when failures occur.

- let (raw_scorer, filterer) = vector_holder.get_scorer_and_filterer(query).unwrap();
+ let (raw_scorer, filterer) = vector_holder
+     .get_scorer_and_filterer(query)
+     .expect("Failed to retrieve scorer and filterer");

Also applies to: 56-56, 63-63, 72-72, 80-80, 94-94, 111-111, 127-127, 143-143

lib/segment/src/index/hnsw_index/graph_layers.rs (1)

425-425: Avoid potential panic on .unwrap().

Using .unwrap() can cause unwanted panics if get_scorer_and_filterer fails. Consider a more descriptive .expect() or a proper error-handling approach to make failures more explicit.

- let (raw_scorer, filterer) = vector_holder.get_scorer_and_filterer(added_vector).unwrap();
+ let (raw_scorer, filterer) = vector_holder
+     .get_scorer_and_filterer(added_vector)
+     .expect("Failed to retrieve scorer and filterer");
🛑 Comments failed to post (1)
lib/segment/benches/hnsw_build_graph.rs (1)

31-33: 💡 Verification agent

🧩 Analysis chain

Refactored to separate scoring and filtering

The code now retrieves both a raw scorer and a filterer directly from the vector holder, replacing the previous approach of using FakeFilterContext and FilteredScorer. This change aligns with the PR's objective of decoupling scoring and filtering mechanisms.

Consider handling potential errors from get_scorer_and_filterer more gracefully instead of using unwrap(), which could cause panics in case of failure:


🏁 Script executed:

#!/bin/bash
# Search for other instances of error handling with get_scorer_and_filterer
rg "get_scorer_and_filterer.*unwrap" --type rust

Length of output: 2698


Enhance Error Handling for get_scorer_and_filterer

The refactor effectively decouples scoring and filtering by retrieving both components directly from the vector holder. However, our search indicates that using unwrap() with get_scorer_and_filterer is pervasive across the codebase (including in this file, e.g., in lib/segment/benches/hnsw_build_graph.rs on lines 31–33, and elsewhere). This approach could lead to panics if the function ever fails.

  • Action Items:
    • Review and refactor error handling for get_scorer_and_filterer both here and in other affected files (e.g., in lib/segment/src/index/hnsw_index/graph_layers.rs and lib/segment/src/index/hnsw_index/graph_layers_builder.rs).
    • Consider returning more informative errors or using methods like expect() with a clear message, or refactor to propagate errors gracefully.

@coszio coszio self-requested a review March 24, 2025 20:02
    Some(f) => scores
        .retain(|score| self.simple_filterer.check_vector(score.idx) && f.check(score.idx)),
}
if limit != 0 {
Member


Why is this condition needed?

Member


Shall we put it at the beginning of the function body without the condition?

Contributor Author


Relevant code in the current version of this PR:

pub fn score_points(
    &mut self,
    point_ids: &mut Vec<PointOffsetType>,
    limit: usize,
) -> impl Iterator<Item = ScoredPointOffset> {
    point_ids.retain(|point_id| self.check_vector(*point_id));
    if limit != 0 {
        point_ids.truncate(limit);
    }


limit == 0 means "don't truncate scores". I don't like it either, but this logic was there before this PR. I've added a comment in FilteredScorer::score_points about this.

(maybe limit: usize can be replaced with Option<NonZeroUsize>, but I don't want more bloat in this PR)

Shall we put it at the beginning of the function body without the condition?

No, retain-then-truncate may produce a different result than truncate-then-retain.
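A toy example (assumed data) of how the two orders diverge:

```rust
// The filter rejects id 0; with candidates [0, 1, 2, 3] and limit = 2,
// the two orders return different sets.
fn retain_then_truncate(mut ids: Vec<u32>, limit: usize) -> Vec<u32> {
    ids.retain(|&id| id != 0); // filter first
    if limit != 0 {
        ids.truncate(limit);
    }
    ids
}

fn truncate_then_retain(mut ids: Vec<u32>, limit: usize) -> Vec<u32> {
    if limit != 0 {
        ids.truncate(limit); // truncate first
    }
    ids.retain(|&id| id != 0);
    ids
}
```

Filtering first keeps `limit` surviving candidates; truncating first can return fewer than `limit` even when enough candidates pass the filter.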

@generall
Member

After review in-call:

  • Try to merge Filterer and RawScorer in the following way:

    FilteredScorer uses RawScorer, combining the two into a single call:

            filterer.filter_scores(&mut scores, limit);
            scorer.score_points(&mut scores);

    If RawScorer is needed without filters, it can be extracted from (and used via) FilteredScorer.

  • The is_stopped flag should be moved to FilteredScorer.
  • Move peek_top_iter and peek_top_all into FilteredScorer.

For a separate PR:

  • Try to use query_scorer.score_stored_batch in score_points of RawScorerImpl
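The suggested combined call could look like this minimal sketch (hypothetical types; the real Filterer/RawScorer interfaces differ):

```rust
type PointOffsetType = u32;
type ScoreType = f32;

#[derive(Debug, PartialEq, Clone, Copy)]
struct ScoredPointOffset {
    idx: PointOffsetType,
    score: ScoreType,
}

struct Filterer {
    deleted: Vec<bool>, // stand-in for deletion bitslices + filter context
}

impl Filterer {
    fn filter_scores(&self, scores: &mut Vec<ScoredPointOffset>, limit: usize) {
        scores.retain(|s| !self.deleted[s.idx as usize]);
        if limit != 0 {
            scores.truncate(limit);
        }
    }
}

struct Scorer;

impl Scorer {
    // Fills `.score` in place; no filtering here anymore.
    fn score_points(&self, scores: &mut [ScoredPointOffset]) {
        for s in scores.iter_mut() {
            s.score = s.idx as ScoreType; // dummy scoring
        }
    }
}

// FilteredScorer combines the two calls, as suggested in the review notes.
struct FilteredScorer {
    filterer: Filterer,
    scorer: Scorer,
}

impl FilteredScorer {
    fn score_points(&self, scores: &mut Vec<ScoredPointOffset>, limit: usize) {
        self.filterer.filter_scores(scores, limit);
        self.scorer.score_points(scores);
    }
}

fn demo() -> Vec<ScoredPointOffset> {
    let fs = FilteredScorer {
        filterer: Filterer {
            deleted: vec![true, false, false, false],
        },
        scorer: Scorer,
    };
    let mut points: Vec<ScoredPointOffset> = [0u32, 1, 2, 3]
        .iter()
        .map(|&idx| ScoredPointOffset { idx, score: 0.0 })
        .collect();
    // Drops deleted point 0, truncates to limit 2, then scores in place.
    fs.score_points(&mut points, 2);
    points
}
```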

Contributor

@Copilot Copilot AI left a comment


Copilot reviewed 23 out of 23 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (2)

lib/segment/src/vector_storage/raw_scorer.rs:941

  • [nitpick] Consider renaming this method (or updating its documentation) to emphasize that it fills an existing scores buffer in-place instead of returning a new vector; this would clarify that the scores slice must be pre-initialized (e.g. using ScoredPointOffset::new_unset).
fn score_points(&self, scores: &mut [ScoredPointOffset]) {

lib/segment/src/fixtures/index_fixtures.rs:170

  • [nitpick] Consider refactoring the repeated pattern of obtaining a RawScorer and its corresponding PointsFilterer into a dedicated helper function or utility to reduce boilerplate at call sites.
pub fn get_scorer_and_filterer(

    Some(f) => scores
        .retain(|score| self.simple_filterer.check_vector(score.idx) && f.check(score.idx)),
}
if limit != 0 {
Member


Shall we put it at the beginning of the function body without the condition?

@xzfc xzfc marked this pull request as draft April 14, 2025 10:24
@xzfc xzfc force-pushed the separate-scorer-filterer branch from 0586a80 to 972a598 Compare April 17, 2025 11:31
@xzfc xzfc force-pushed the separate-scorer-filterer branch from 972a598 to e2bce5a Compare April 22, 2025 19:05
@xzfc xzfc changed the title Separate scorer and filterer Move filtering logic from RawScorer to FilteredScorer Apr 22, 2025
@xzfc xzfc marked this pull request as ready for review April 22, 2025 22:45
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (6)
lib/common/common/src/iterator_ext/mod.rs (1)

47-55: Well-implemented benchmarking utility!

The black_box method is a good addition for benchmarking purposes. It correctly applies std::hint::black_box to each item in the iterator to prevent compiler optimizations during benchmarks.

If you prefer a more concise style, the closure could be simplified to:

-    self.for_each(|p| {
-        std::hint::black_box(p);
-    });
+    self.for_each(std::hint::black_box);

However, the current implementation is explicit and clear, so this is purely optional.

lib/segment/src/index/sparse_index/sparse_vector_index.rs (2)

302-309: Leverage quantized vectors if present for cheaper scoring

FilteredScorer::new accepts an optional quantized_vectors parameter, but we always pass None.
When the underlying storage has a pre‑built quantized representation, using it typically reduces RAM traffic and SIMD cost during scoring.

-        let scorer = FilteredScorer::new(
-            query_vector.clone(),
-            &vector_storage,
-            None,
-            None,
-            deleted_point_bitslice,
-            vector_query_context.hardware_counter(),
-        )?;
+        let scorer = FilteredScorer::new(
+            query_vector.clone(),
+            &vector_storage,
+            vector_storage.quantized_vectors(),   // <- returns `Option<&QuantizedVectors>`
+            None,
+            deleted_point_bitslice,
+            vector_query_context.hardware_counter(),
+        )?;

(If quantized_vectors() is not available on VectorStorageEnum, you can expose it behind a helper.)
This is a pure optimisation; functional behaviour remains identical.


318-323: Passing the iterator by reference is unnecessary

peek_top_iter takes its iterator by value, and &mut I implements Iterator only indirectly.
Simply move the iterator to the call‑site for slightly cleaner code and avoid the extra reference indirection:

-                let res = scorer.peek_top_iter(&mut filtered_points, top, &is_stopped)?;
+                let res = scorer.peek_top_iter(filtered_points, top, &is_stopped)?;
lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (1)

52-83: Heavy match‑cascade could be centralised

The triple‑nested match on (datatype, distance) duplicates very similar branches 18 times.
Consider a helper such as

fn route<TElement, TMetric>(self, builder: fn(Self) -> OperationResult<Box<dyn RawScorer<'a>>>)

or a small macro to collapse repetition.
Less duplication eases future distance additions (e.g. CosineSquared) and reduces the chance of forgetting one of the 18 arms.

lib/segment/src/index/hnsw_index/point_scorer.rs (2)

143-152: Slice scores_buffer to avoid scanning unused entries

zip(&*point_ids, &self.scores_buffer) iterates the entire buffer, although only the first point_ids.len() slots are freshly written.
Use the same slice you gave to score_points:

-        std::iter::zip(&*point_ids, &self.scores_buffer)
+        std::iter::zip(&*point_ids, &self.scores_buffer[..point_ids.len()])

Saves needless cache traffic when the buffer had previously grown large.


147-152: scores_buffer never shrinks – potential long‑lived memory spike

The buffer only ever grows; if one request with 1 M points arrives, every subsequent tiny request will still carry a 1 M‑entry buffer.

Consider resetting it when point_ids.len() falls under a fraction of current capacity, or allocate fresh per call and reuse via thread‑local pool.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0586a80 and e2bce5a.

📒 Files selected for processing (24)
  • lib/common/common/src/iterator_ext/mod.rs (1 hunks)
  • lib/segment/benches/fixture.rs (2 hunks)
  • lib/segment/benches/hnsw_build_asymptotic.rs (7 hunks)
  • lib/segment/benches/hnsw_build_graph.rs (2 hunks)
  • lib/segment/benches/hnsw_search_graph.rs (4 hunks)
  • lib/segment/benches/scorer_mmap.rs (2 hunks)
  • lib/segment/benches/vector_search.rs (3 hunks)
  • lib/segment/src/fixtures/index_fixtures.rs (2 hunks)
  • lib/segment/src/index/hnsw_index/gpu/gpu_graph_builder.rs (3 hunks)
  • lib/segment/src/index/hnsw_index/gpu/gpu_insert_context.rs (5 hunks)
  • lib/segment/src/index/hnsw_index/gpu/gpu_vector_storage/tests.rs (1 hunks)
  • lib/segment/src/index/hnsw_index/gpu/mod.rs (3 hunks)
  • lib/segment/src/index/hnsw_index/graph_layers.rs (6 hunks)
  • lib/segment/src/index/hnsw_index/graph_layers_builder.rs (7 hunks)
  • lib/segment/src/index/hnsw_index/hnsw.rs (13 hunks)
  • lib/segment/src/index/hnsw_index/point_scorer.rs (2 hunks)
  • lib/segment/src/index/hnsw_index/tests/mod.rs (2 hunks)
  • lib/segment/src/index/hnsw_index/tests/test_compact_graph_layer.rs (2 hunks)
  • lib/segment/src/index/plain_vector_index.rs (4 hunks)
  • lib/segment/src/index/sparse_index/sparse_vector_index.rs (4 hunks)
  • lib/segment/src/vector_storage/async_raw_scorer.rs (9 hunks)
  • lib/segment/src/vector_storage/dense/memmap_dense_vector_storage.rs (9 hunks)
  • lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (10 hunks)
  • lib/segment/src/vector_storage/quantized/quantized_vectors.rs (0 hunks)
💤 Files with no reviewable changes (1)
  • lib/segment/src/vector_storage/quantized/quantized_vectors.rs
✅ Files skipped from review due to trivial changes (3)
  • lib/segment/src/index/hnsw_index/tests/test_compact_graph_layer.rs
  • lib/segment/src/index/hnsw_index/graph_layers.rs
  • lib/segment/src/index/hnsw_index/graph_layers_builder.rs
🚧 Files skipped from review as they are similar to previous changes (10)
  • lib/segment/src/index/hnsw_index/gpu/mod.rs
  • lib/segment/benches/hnsw_build_graph.rs
  • lib/segment/src/index/hnsw_index/gpu/gpu_insert_context.rs
  • lib/segment/src/index/hnsw_index/tests/mod.rs
  • lib/segment/src/vector_storage/dense/memmap_dense_vector_storage.rs
  • lib/segment/benches/hnsw_search_graph.rs
  • lib/segment/src/fixtures/index_fixtures.rs
  • lib/segment/benches/hnsw_build_asymptotic.rs
  • lib/segment/src/index/hnsw_index/gpu/gpu_graph_builder.rs
  • lib/segment/src/index/hnsw_index/hnsw.rs
🧰 Additional context used
🧬 Code Graph Analysis (3)
lib/segment/benches/vector_search.rs (2)
lib/segment/src/vector_storage/dense/simple_dense_vector_storage.rs (1)
  • open_simple_dense_vector_storage (92-108)
lib/segment/src/index/hnsw_index/point_scorer.rs (1)
  • new_for_test (99-111)
lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (1)
lib/segment/src/vector_storage/raw_scorer.rs (1)
  • raw_scorer_from_query_scorer (423-434)
lib/segment/src/index/hnsw_index/point_scorer.rs (4)
lib/segment/src/common/operation_error.rs (1)
  • check_process_stopped (242-247)
lib/segment/src/vector_storage/raw_scorer.rs (2)
  • check_deleted_condition (736-747)
  • new_raw_scorer (51-121)
lib/segment/src/vector_storage/quantized/quantized_vectors.rs (1)
  • raw_scorer (166-180)
lib/common/common/src/fixed_length_priority_queue.rs (1)
  • top (80-82)
🔇 Additional comments (12)
lib/segment/src/index/hnsw_index/gpu/gpu_vector_storage/tests.rs (2)

696-698: Updated interface aligns with refactored RawScorer.

The call to raw_scorer now omits the point_deleted parameter, reflecting the architectural change where filtering logic has been moved from RawScorer to FilteredScorer. This simplifies the scoring interface.


699-701: Interface update correctly implements the filtering logic refactor.

The removal of the point_deleted parameter from new_raw_scorer_for_test is consistent with the PR's objective of moving filtering responsibility from RawScorer to FilteredScorer. This change simplifies scoring component interfaces.
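The split described in these two comments can be sketched as below. All names here (NewRawScorer, score_if_live, the bool-slice deletion mask) are hypothetical stand-ins for the crate's actual types, shown only to illustrate where the deletion check now lives:

```rust
// Hedged sketch of the interface split described above; these names and
// signatures are illustrative, not qdrant's actual API.
type PointOffsetType = u32;

// After the refactor, the raw scorer only scores; it no longer owns the
// deletion flags or filters internally.
struct NewRawScorer;

impl NewRawScorer {
    fn score_point(&self, idx: PointOffsetType) -> f32 {
        // Pure scoring; a stand-in formula instead of a real distance.
        idx as f32 * 0.5
    }
}

// FilteredScorer now holds the deletion flags itself, so check_vector()
// needs no dyn call into the raw scorer.
struct FilteredScorer<'a> {
    raw: NewRawScorer,
    point_deleted: &'a [bool], // simplified stand-in for a BitSlice
}

impl FilteredScorer<'_> {
    fn check_vector(&self, idx: PointOffsetType) -> bool {
        !self.point_deleted[idx as usize]
    }

    fn score_if_live(&self, idx: PointOffsetType) -> Option<f32> {
        self.check_vector(idx).then(|| self.raw.score_point(idx))
    }
}

fn main() {
    let deleted = [false, true, false];
    let scorer = FilteredScorer { raw: NewRawScorer, point_deleted: &deleted };
    assert_eq!(scorer.score_if_live(1), None); // deleted point is filtered out
    assert_eq!(scorer.score_if_live(2), Some(1.0));
    println!("ok");
}
```

Because the deletion check is an inline slice lookup rather than a virtual call, it can run cheaply before any FilterContext is consulted.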

lib/segment/benches/vector_search.rs (3)

15-15: Updated imports to align with the refactored scorer architecture

The changes reflect the refactoring where FilteredScorer now handles both scoring and filtering logic, replacing the previous direct import and usage of new_raw_scorer_for_test.

Also applies to: 18-18


69-73: Direct usage of FilteredScorer simplifies benchmark code

The benchmark now uses FilteredScorer::new_for_test directly instead of new_raw_scorer_for_test, which aligns with the refactored architecture where filtering is integrated into the scorer component.


92-96: Simplified scorer construction in random access benchmark

Consistent change to use FilteredScorer::new_for_test directly, making the code more straightforward by removing the unwrapping that was previously needed when using new_raw_scorer_for_test.

lib/segment/src/index/plain_vector_index.rs (3)

9-9: Updated imports to align with refactored scorer architecture

The changes reflect the refactoring where FilteredScorer now handles both scoring and filtering logic, replacing the previous direct usage of new_raw_scorer.

Also applies to: 22-22


113-120: FilteredScorer now handles both scoring and filtering in a unified interface

The FilteredScorer::new constructor now takes additional parameters for filter context that were previously handled separately. This consolidation simplifies the search logic and aligns with the PR objective of moving filtering logic from RawScorer to FilteredScorer.


141-148: Consistent use of FilteredScorer for unfiltered searches

The same pattern is applied for unfiltered searches, using None for filter-related parameters. This unified approach maintains consistency and simplifies the codebase by having a single entry point for scorer creation.
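A minimal sketch of such a single entry point, assuming a FilterContext-style trait; the names and the constructor shape are illustrative, not the actual qdrant signatures:

```rust
// Hedged sketch: one constructor serves both filtered and unfiltered
// searches by taking an Option for the filter context.
type PointOffsetType = u32;

trait FilterContext {
    fn check(&self, idx: PointOffsetType) -> bool;
}

struct FilteredScorer<'a> {
    filter_context: Option<&'a dyn FilterContext>,
}

impl<'a> FilteredScorer<'a> {
    // Filtered searches pass Some(ctx); unfiltered searches pass None.
    fn new(filter_context: Option<&'a dyn FilterContext>) -> Self {
        Self { filter_context }
    }

    fn check_point(&self, idx: PointOffsetType) -> bool {
        // With no filter context, every point passes.
        self.filter_context.map_or(true, |ctx| ctx.check(idx))
    }
}

struct EvenOnly;
impl FilterContext for EvenOnly {
    fn check(&self, idx: PointOffsetType) -> bool {
        idx % 2 == 0
    }
}

fn main() {
    let filtered = FilteredScorer::new(Some(&EvenOnly));
    let unfiltered = FilteredScorer::new(None);
    assert!(filtered.check_point(2));
    assert!(!filtered.check_point(3));
    assert!(unfiltered.check_point(3));
    println!("ok");
}
```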

lib/segment/benches/scorer_mmap.rs (2)

13-13: Updated imports to align with refactored scorer architecture

The changes reflect the refactoring where FilteredScorer now handles both scoring and filtering logic, replacing the previous direct import and usage of new_raw_scorer_for_test.

Also applies to: 16-16


66-70: Direct usage of FilteredScorer simplifies mmap benchmark code

The benchmark now uses FilteredScorer::new_for_test directly instead of new_raw_scorer_for_test, which aligns with the refactored architecture where filtering is integrated into the scorer component.

lib/segment/benches/fixture.rs (2)

8-8: Updated imports to align with refactored scorer architecture

The imports are simplified by removing FakeFilterContext and FilteredScorer, as these are now handled internally by TestRawScorerProducer.


63-63: Simplified scorer acquisition in graph building

The code now directly gets a scorer from the vector holder instead of the previous two-step process (creating a raw scorer and then wrapping it with a filtered scorer). This simplification makes the code more readable and aligns with the PR's goal of consolidating filter and scorer functionality.
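The one-step acquisition can be sketched as follows; the VectorHolder type, get_scorer method, and dot-product scoring are hypothetical stand-ins, not the fixture's real API:

```rust
// Hedged sketch of one-step scorer acquisition: the holder hands out a
// ready-to-use scorer directly, instead of the previous two-step process
// (build a raw scorer, then wrap it in a filtered scorer).
type PointOffsetType = u32;

struct VectorHolder {
    vectors: Vec<Vec<f32>>,
}

struct Scorer<'a> {
    holder: &'a VectorHolder,
    query: Vec<f32>,
}

impl VectorHolder {
    fn get_scorer(&self, query: Vec<f32>) -> Scorer<'_> {
        Scorer { holder: self, query }
    }
}

impl Scorer<'_> {
    fn score(&self, idx: PointOffsetType) -> f32 {
        // Dot product as a stand-in similarity.
        self.holder.vectors[idx as usize]
            .iter()
            .zip(&self.query)
            .map(|(a, b)| a * b)
            .sum()
    }
}

fn main() {
    let holder = VectorHolder {
        vectors: vec![vec![1.0, 0.0], vec![0.0, 1.0]],
    };
    let scorer = holder.get_scorer(vec![2.0, 3.0]);
    assert_eq!(scorer.score(0), 2.0);
    assert_eq!(scorer.score(1), 3.0);
    println!("ok");
}
```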

@xzfc xzfc requested a review from Copilot April 22, 2025 23:23
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR refactors the scoring and filtering logic by moving several filtering responsibilities out of RawScorer and into FilteredScorer, resulting in a more streamlined scorer API and improved performance in certain code paths. Key changes include replacing new_raw_scorer calls with FilteredScorer::new (and its variants), updating methods such as peek_top_iter/peek_top_all, and adjusting tests and benchmarks to use the new API.

Reviewed Changes

Copilot reviewed 31 out of 31 changed files in this pull request and generated no comments.

Show a summary per file
  • lib/segment/src/index/sparse_index/sparse_vector_index.rs: Updated scorer construction and method calls to use FilteredScorer instead of RawScorer.
  • lib/segment/src/index/plain_vector_index.rs: Replaced new_raw_scorer with FilteredScorer::new and removed unused filter parameters.
  • lib/segment/src/index/hnsw_index/*: Across multiple files (tests, hnsw.rs, graph_layers, gpu modules), updated scorer construction and related API calls to use the new FilteredScorer interface.
  • lib/segment/src/index/hnsw_index/point_scorer.rs: Modified FilteredScorer to internally filter point lists and return an iterator of scored points rather than a slice.
  • Various benchmarks and fixture files: Adjusted scorer invocations to match the new API, removing legacy FakeFilterContext usage.
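The iterator-returning interface noted for point_scorer.rs can be sketched like this; the struct layout and the stand-in scoring formula are assumptions, not the file's actual contents:

```rust
// Hedged sketch: score_points filters internally and yields scored points
// lazily, so the caller no longer manages a reusable output buffer.
type PointOffsetType = u32;

#[derive(Debug, PartialEq)]
struct ScoredPointOffset {
    idx: PointOffsetType,
    score: f32,
}

struct FilteredScorer {
    deleted: Vec<bool>,
}

impl FilteredScorer {
    fn score_points<'a>(
        &'a self,
        ids: &'a [PointOffsetType],
    ) -> impl Iterator<Item = ScoredPointOffset> + 'a {
        ids.iter()
            .copied()
            // Filtering happens inside the scorer...
            .filter(|&idx| !self.deleted[idx as usize])
            // ...and surviving ids are scored on the fly.
            .map(|idx| ScoredPointOffset { idx, score: idx as f32 * 0.5 })
    }
}

fn main() {
    let scorer = FilteredScorer { deleted: vec![false, true, false] };
    let scored: Vec<_> = scorer.score_points(&[0, 1, 2]).collect();
    assert_eq!(scored.len(), 2); // point 1 was filtered out
    assert_eq!(scored[1].idx, 2);
    println!("ok");
}
```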

@xzfc xzfc merged commit 44341bb into dev Apr 23, 2025
20 checks passed
@xzfc xzfc deleted the separate-scorer-filterer branch April 23, 2025 18:23