
Conversation

Contributor

@xzfc xzfc commented May 7, 2025

This PR optimizes heuristic computation.

When building the HNSW graph (particularly when adding incoming links), the following loop pattern occurs:
a few links are added to the container, then the container is pruned using the heuristic, and the loop starts again.
In pseudocode:

links += new_points
links = heuristic(links)
links += new_points
links = heuristic(links)
…

Running the heuristic requires O(n*n) vector comparisons, so this part of the code can take a significant share of the graph-building time.

However, some vector comparisons can be avoided if part of the input has already been processed by the heuristic.
This PR implements that optimization by introducing a new struct, LinksContainer, which tracks the number of points already processed by the heuristic and uses this information to speed up the next heuristic call.

A similar change was proposed two years ago in #1965 and #1968, and described in a notion page.
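To make the idea concrete, here is a minimal, hypothetical sketch (not the actual `LinksContainer` code; the function name `prune_with_heuristic` and the `(id, score, processed)` tuple layout are illustrative). The key observation: if two links both survived a previous heuristic pass, they are already known not to dominate each other, so the vector comparison between them can be skipped on the next pass.

```rust
/// Hypothetical sketch: greedy heuristic pruning that skips comparisons
/// between pairs of links that were both kept by a previous pass.
/// `score` is assumed to be similarity-like (higher = closer).
fn prune_with_heuristic(
    // (point_id, score_to_target, survived_previous_pass)
    candidates: &mut Vec<(u32, f32, bool)>,
    m: usize,
    score: impl Fn(u32, u32) -> f32,
) -> Vec<u32> {
    // Best candidates first.
    candidates.sort_by(|a, b| b.1.total_cmp(&a.1));
    let mut kept: Vec<(u32, f32, bool)> = Vec::new();
    'outer: for &(idx, s, processed) in candidates.iter() {
        for &(kept_idx, _, kept_processed) in &kept {
            if processed && kept_processed {
                // Both links already passed the heuristic together,
                // so this vector comparison can be skipped.
                continue;
            }
            if score(idx, kept_idx) > s {
                // Dominated by an already-kept link: drop the candidate.
                continue 'outer;
            }
        }
        kept.push((idx, s, processed));
        if kept.len() >= m {
            break;
        }
    }
    kept.into_iter().map(|(idx, _, _)| idx).collect()
}

fn main() {
    // Points on a line; score = negated distance (higher = closer).
    let coords = [0.0f32, 1.0, -1.0, 5.0];
    let score = |a: u32, b: u32| -(coords[a as usize] - coords[b as usize]).abs();
    // Candidates relative to target point 0, none processed yet.
    let mut candidates = vec![
        (1, score(1, 0), false),
        (2, score(2, 0), false),
        (3, score(3, 0), false),
    ];
    let kept = prune_with_heuristic(&mut candidates, 3, score);
    println!("{kept:?}"); // point 3 is dominated by point 1
}
```

In the real PR the `processed` flag is derived from the `processed_by_heuristic` counter: the first `processed_by_heuristic` entries of the container are exactly those that survived the previous pass.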

Performance

Graph building time measured for a single segment (not the full dataset). Times in seconds.

| Dataset | Quantization | Points | dev | This PR | Change |
|---|---|---|---|---|---|
| dbpedia | | 25.6k | 7.49 | 7.38 | -1.5 % |
| dbpedia | | 243.8k | 120.00 | 110.93 | -7.6 % |
| dbpedia | Binary | 243.8k | 39.71 | 34.82 | -12.3 % |
| dbpedia | Scalar u8 0.99 | 243.8k | 66.14 | 60.51 | -8.5 % |
| dbpedia | Product x16 | 243.8k | 1289.36 | 789.30 | -38.8 % |
| glove 100 | | 300k | 57.40 | 50.84 | -11.4 % |
| random 128 | | 100k | 39.16 | 24.68 | -37.0 % |

Further improvements

Some ideas (not implemented in this PR):

  1. Cache scores, i.e. add scores: Vec<ScoreType> to LinksContainer. I expect it to cut the number of comparisons by 33% at the cost of using 2x RAM.

    Not implemented in this PR, as it imposes a different trade-off (unlike the current implementation, which I believe is strictly better).

  2. Use a smarter way to merge old and new scores. In this PR, I use a regular sort with a comparison function that can skip score calculation when comparing two old links (src). This slightly reduces the number of comparisons, but is still not ideal. A better approach would be a merging/sorting algorithm that minimizes the number of comparisons; e.g., adding exactly one link should amount to a binary search. Check the description of the binary-merge crate. (I'm not proposing to use this particular crate, but its problem statement perfectly describes what I mean.)

    Not implemented in this PR, as I think it would be overkill at this point.
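As a sketch of idea 2 (hypothetical, not part of this PR): when exactly one new link arrives and the kept links are already sorted by score, the insertion position can be found with `partition_point` (a binary search over the sorted slice) instead of re-sorting the whole list. The function name `insert_sorted` and the `(id, score)` layout are illustrative.

```rust
/// Hypothetical sketch: insert one link into a list kept sorted by
/// descending score, truncating to at most `m` entries. Finding the
/// position costs O(log n) comparisons instead of a full re-sort.
fn insert_sorted(links: &mut Vec<(u32, f32)>, id: u32, score: f32, m: usize) {
    // partition_point returns the index of the first element for which
    // the predicate is false, i.e. the first score <= the new one.
    let pos = links.partition_point(|&(_, s)| s > score);
    links.insert(pos, (id, score));
    links.truncate(m);
}

fn main() {
    let mut links = vec![(7, 0.9f32), (3, 0.5), (9, 0.2)];
    insert_sorted(&mut links, 5, 0.7, 3);
    println!("{links:?}"); // (5, 0.7) lands between 0.9 and 0.5; (9, 0.2) is evicted
}
```

This only covers position-finding; the heuristic's domination checks for the new link would still be needed on top of it.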


@Copilot Copilot AI left a comment


Pull Request Overview

This PR optimizes the heuristic computation used during HNSW graph building by reducing redundant vector comparisons.

  • Introduces a new LinksContainer abstraction to track processed links.
  • Replaces the old LinkContainer type with LinksContainer across graph layers and GPU link updates.
  • Updates heuristic connection logic and modifies related tests accordingly.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| lib/segment/src/index/hnsw_index/mod.rs | Added the new module for LinksContainer. |
| lib/segment/src/index/hnsw_index/graph_layers_builder.rs | Replaced LinkContainer with LinksContainer and updated connection and heuristic logic. |
| lib/segment/src/index/hnsw_index/gpu/gpu_links.rs | Modified code to use the new LinksContainer API when filtering and setting links. |
Comments suppressed due to low confidence (1)

lib/segment/src/index/hnsw_index/graph_layers_builder.rs:806

  • The removal of tests for candidate selection and connection logic reduces test coverage for the new heuristic optimization. Please ensure that equivalent tests are added to validate the updated behavior.
    // Removed tests for candidate selection heuristics and connect_new_point

@xzfc xzfc requested review from generall and IvanPleshkov May 8, 2025 02:13

coderabbitai bot commented May 8, 2025

📝 Walkthrough

Walkthrough

The changes introduce a new LinksContainer struct that manages collections of point links with heuristic pruning for HNSW index construction. The GraphLayersBuilder is refactored to replace manual vector-based link manipulation and heuristic candidate selection with calls to LinksContainer methods. The GPU-related code is updated to access link slices directly from the new container, improving how links are uploaded and downloaded. A new private module links_container is added to the hnsw_index module. Several internal static methods and test functions related to heuristic selection are removed, and test code is updated to use the new LinksContainer API. Unused imports are also cleaned up.

Suggested reviewers

  • IvanPleshkov
  • generall



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (3)
lib/segment/src/index/hnsw_index/links_container.rs (3)

9-13: Consider using usize for processed_by_heuristic

The field is always cast back and forth between usize and u32.
Since Vec::len() already returns usize, keeping the same type avoids casts and the risk of silent truncation on very large graphs.

-    processed_by_heuristic: u32,
+    processed_by_heuristic: usize,

A simple, non-breaking change (where a NonZeroUsize is needed, it can be constructed from the usize via NonZeroUsize::new).


157-159: Avoid panicking on allocation failures

try_set_capacity_exact(...).unwrap() will panic if memory cannot be reserved.
In a large-scale build this can abort the whole indexing process. Propagate the error (?) or fall back to a slower but safe path instead of panicking.


11-12: Minor wording nitpick in the doc comment

“Amount of links that processed by the heuristic.” → “Number of links that have been processed by the heuristic.”

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7f0b48a and 6bc7adb.

📒 Files selected for processing (4)
  • lib/segment/src/index/hnsw_index/gpu/gpu_links.rs (3 hunks)
  • lib/segment/src/index/hnsw_index/graph_layers_builder.rs (15 hunks)
  • lib/segment/src/index/hnsw_index/links_container.rs (1 hunks)
  • lib/segment/src/index/hnsw_index/mod.rs (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
lib/segment/src/index/hnsw_index/gpu/gpu_links.rs (1)
lib/segment/src/index/hnsw_index/links_container.rs (1)
  • links (27-29)
⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: test-snapshot-operations-s3-minio
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-low-resources
  • GitHub Check: test-consistency
  • GitHub Check: test-consensus-compose
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: integration-tests
  • GitHub Check: integration-tests-consensus
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: Basic TLS/HTTPS tests
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: lint
  • GitHub Check: storage-compat-test
🔇 Additional comments (23)
lib/segment/src/index/hnsw_index/mod.rs (1)

11-11: New module declaration looks good.

This new module declaration properly follows Rust conventions and integrates the new LinksContainer functionality seamlessly into the existing HNSW index structure.

lib/segment/src/index/hnsw_index/gpu/gpu_links.rs (3)

190-191: Correctly updated to use the new container API.

The code now correctly accesses the links through the .links() method of the LinksContainer, which properly encapsulates the internal representation of links.


200-200: Correctly adapted to use the new container API.

The code now passes the links accessed via the .links() method to set_links, correctly integrating with the new LinksContainer API.


263-270: Simplified link update with new fill_from method.

This change efficiently replaces the previous pattern of clearing and extending link vectors with a single fill_from call. This is more concise and likely more efficient since it avoids intermediate operations.

lib/segment/src/index/hnsw_index/graph_layers_builder.rs (18)

7-8: Import optimization looks good.

Simplified the imports by removing unnecessary types that are no longer directly used after the refactoring.


15-15: Correctly importing new container types.

The import of ItemsBuffer and LinksContainer from the new module is correctly added, making these types available for use in this file.


24-24: Good type alias update for container abstraction.

The LockedLinkContainer type is properly updated to use the new LinksContainer struct, maintaining the existing locking pattern with RwLock.


179-183: Correctly using LinksContainer in initialization.

The code now properly initializes the links layers with LinksContainer instances instead of raw vectors. The capacity is correctly set based on the reserve parameter.


242-244: Updated iteration with LinksContainer API.

The code now correctly iterates over the links stored in the LinksContainer using the .iter() method.


246-246: Correctly using LinksContainer conversion.

The code now properly converts from LinksContainer to a vector for processing using the .into_vec() method.


287-287: Properly using LinksContainer for level initialization.

The code now correctly initializes a new level with a LinksContainer of the appropriate capacity instead of a raw vector.


357-357: Simplified link addition with fill_from.

Using fill_from to populate links from a source iterator is more concise and likely more efficient than the previous approach of manual extension.


435-437: Efficiently using sorted heuristic filling.

The code now uses the specialized fill_from_sorted_with_heuristic method to populate links based on a sorted iterator, which can optimize the heuristic pruning process by avoiding redundant comparisons.


440-440: Good reuse of ItemsBuffer for heuristic optimization.

Using a reusable ItemsBuffer instance can reduce memory allocations during the heuristic pruning process, potentially improving performance.


442-444: Encapsulated heuristic connection logic.

The code now delegates the complex heuristic connection logic to the LinksContainer through its connect_with_heuristic method, which should improve code maintainability and encapsulation.


460-460: Simplified connection with encapsulated method.

Using the connect method of LinksContainer simplifies the code by encapsulating the link connection logic.


465-465: Consistent use of LinksContainer API.

The code consistently uses the connect method for the bidirectional link connections, ensuring symmetry in the graph connections.


482-482: Correctly accessing links through container API.

In the get_average_connectivity_on_level method, the code now correctly accesses the links through the .links() method of the LinksContainer.


636-636: Updated test code for new container API.

Test code has been properly updated to use the .links() method for accessing the links in the LinksContainer.


699-699: Test code correctly using new container API.

The test code correctly retrieves links using .links() for comparison, ensuring that tests work correctly with the new container implementation.


729-729: Consistently updated all test code.

All test code has been consistently updated to use the new container API, ensuring that tests continue to function correctly after the refactoring.


1-807: Overall refactoring significantly improves heuristic optimization.

The comprehensive refactoring to use the new LinksContainer and its specialized methods for heuristic pruning aligns perfectly with the PR objective of optimizing heuristic computation during HNSW graph construction. The code now avoids redundant vector comparisons by tracking processed points in the container, which should lead to the performance improvements described in the PR summary (up to 38.8% faster graph building).

The changes maintain the original behavior while making the code more maintainable and efficient. The consistent API usage across both the main implementation and GPU-specific code ensures that all parts of the system work together correctly.

lib/segment/src/index/hnsw_index/links_container.rs (1)

59-64: Verify the comparison direction in the heuristic filter

score(candidate.idx, existing) > candidate.score assumes that a larger score means closer (cosine similarity style).
If ScoreType is a distance (smaller = closer, as EuclidMetric typically is), the comparison must be inverted (<) or the heuristic will reject all valid neighbours and degrade recall.

Please double-check the sign convention used across the codebase.
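To illustrate the concern (a standalone sketch; the actual convention depends on how ScoreType is produced in the codebase): under a similarity convention a larger score means closer, while under a raw-distance convention the "dominated by an existing link" inequality must be flipped. The helper name `dominated` is hypothetical.

```rust
/// Hypothetical helper: is the candidate dominated by an existing link?
/// Similarity convention (higher = closer): dominated when the candidate
/// is closer to the existing link than to the target.
/// Distance convention (lower = closer): the inequality is reversed.
fn dominated(cand_to_existing: f32, cand_to_target: f32, higher_is_closer: bool) -> bool {
    if higher_is_closer {
        cand_to_existing > cand_to_target
    } else {
        cand_to_existing < cand_to_target
    }
}

fn main() {
    // Same geometry expressed as similarity (0.8 vs 0.5) and distance (0.2 vs 0.5):
    assert!(dominated(0.8, 0.5, true));
    assert!(dominated(0.2, 0.5, false));
    println!("sign conventions checked");
}
```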

xzfc added 2 commits May 8, 2025 02:23
This commit extracts Vec<PointOffsetType> and heuristic related
functions from graph_layers_builder.rs into a new file.

No significant changes in logic are made.
@xzfc xzfc force-pushed the heuristic-opt branch from 6bc7adb to b40f1e9 Compare May 8, 2025 02:24

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
lib/segment/src/index/hnsw_index/links_container.rs (2)

66-71: ⚠️ Potential issue

Fix processed_by_heuristic not being updated on early return

When the container fills up to level_m, the function returns early without updating processed_by_heuristic, which can cause subsequent heuristic optimizations to be ineffective.

            self.links.push(candidate.idx);
            if self.links.len() >= level_m {
+                self.processed_by_heuristic = level_m as u32;
                return;
            }
        }
        self.processed_by_heuristic = self.links.len() as u32;

96-102: 🛠️ Refactor suggestion

Check for duplicates before insertion in connect

The current implementation may insert duplicates of the same point ID, which would waste capacity and potentially violate the assumption that each neighbor appears only once.

+        // Check if new_point_id is already in the container
+        if self.links.contains(&new_point_id) {
+            return;
+        }
+
        if self.links.len() < level_m {
            self.links.insert(id_to_insert, new_point_id);
        } else if id_to_insert != self.links.len() {
            self.links.pop();
            self.links.insert(id_to_insert, new_point_id);
        }
🧹 Nitpick comments (3)
lib/segment/src/index/hnsw_index/links_container.rs (3)

139-206: Add more documentation to the complex connect_with_heuristic method

This method implements a non-trivial optimization using the processed_by_heuristic tracking. Consider adding more comments explaining:

  1. How the order field is used to track which items have already been processed by the heuristic
  2. The optimization strategy that avoids redundant score calculations
  3. The relationship between this implementation and the reference implementation

248-250: Consider documenting why this test is ignored

The test_candidate_selection_heuristics test is marked with #[ignore], but it's not clear why. If this is an important test for validating heuristic logic, consider enabling it or documenting why it's ignored.


209-231: Consider a more explicit approach to score caching

The Item struct uses interior mutability with Cell<Option<ScoreType>> for score caching, which adds complexity. Consider a more explicit approach to make the code more readable:

  1. Either compute all scores upfront, or
  2. Use a separate cache structure that maps point pairs to scores

📥 Commits

Reviewing files that changed from the base of the PR and between 6bc7adb and b40f1e9.

📒 Files selected for processing (6)
  • lib/segment/src/index/hnsw_index/gpu/gpu_insert_context.rs (2 hunks)
  • lib/segment/src/index/hnsw_index/gpu/gpu_links.rs (3 hunks)
  • lib/segment/src/index/hnsw_index/gpu/mod.rs (1 hunks)
  • lib/segment/src/index/hnsw_index/graph_layers_builder.rs (15 hunks)
  • lib/segment/src/index/hnsw_index/links_container.rs (1 hunks)
  • lib/segment/src/index/hnsw_index/mod.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • lib/segment/src/index/hnsw_index/gpu/gpu_links.rs
  • lib/segment/src/index/hnsw_index/mod.rs
  • lib/segment/src/index/hnsw_index/graph_layers_builder.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
lib/segment/src/index/hnsw_index/gpu/gpu_insert_context.rs (1)
lib/segment/src/index/hnsw_index/links_container.rs (1)
  • with_capacity (16-21)
🔇 Additional comments (7)
lib/segment/src/index/hnsw_index/gpu/mod.rs (2)

203-205: LGTM: Updated to use the new LinksContainer API

The code correctly adapts to use the new links() method from the LinksContainer abstraction instead of directly cloning the entire lock guard.


207-209: LGTM: Consistent API usage

This change correctly mirrors the same pattern as above, maintaining consistency in how link data is accessed.

lib/segment/src/index/hnsw_index/gpu/gpu_insert_context.rs (3)

477-477: LGTM: Added import for new abstraction

Correctly added the import for the new LinksContainer type.


977-982: LGTM: Improved heuristic implementation using LinksContainer

The code now properly uses the new LinksContainer abstraction for heuristic selection instead of the previously removed static method from GraphLayersBuilder.


984-984: LGTM: Simplified iteration over heuristic results

The iteration is now more concise, directly using the LinksContainer's iterator.

lib/segment/src/index/hnsw_index/links_container.rs (2)

9-13: LGTM: Clear structure with good documentation

The LinksContainer struct is well-defined with clear purpose and documentation. The field processed_by_heuristic is properly explained to track heuristic optimization state.


16-44: LGTM: Well-designed basic container operations

The basic operations (with_capacity, push, links, iter, into_vec, fill_from) provide a complete and consistent API for managing links.

@IvanPleshkov

The GPU test is green:
https://github.com/qdrant/qdrant/actions/runs/14903570575
This means not only that the GPU code is fine, but also that the CPU-GPU equivalency check passes, i.e. this PR didn't change the heuristic results.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (3)
lib/segment/src/index/hnsw_index/links_container.rs (3)

249-250: Document reason for ignoring test

The test is marked with #[ignore] without explaining why. Add a comment to clarify why this test is ignored by default (e.g., if it's time-consuming, resource-intensive, or only needed for specific debugging).

    #[test]
    #[ignore]
+   // This test is ignored by default because it's primarily for manual inspection/debugging
+   // of heuristic behavior rather than automated verification
    fn test_candidate_selection_heuristics() {

174-181: Complex sorting logic may benefit from comments

The sorting logic that prioritizes items with order information while falling back to score comparison for other items is a key part of the optimization, but it's complex enough to warrant an explanatory comment about why this approach is used.

        items.0.sort_unstable_by(|a, b| {
+           // Prioritize items with known order (already processed by heuristic)
+           // to preserve their relative ordering and avoid redundant comparisons
            if let (Some(a_order), Some(b_order)) = (a.order, b.order) {
                return a_order.cmp(&b_order);
            }
            b.score(target_point_id, &mut score)
                .total_cmp(&a.score(target_point_id, &mut score))
        });

188-196: Optimize heuristic filtering for pre-processed items

The optimization logic at lines 189-191 that skips heuristic checks for previously processed items is central to the performance improvement described in the PR, but it's not immediately obvious how this affects algorithm behavior. A brief comment would help future maintainers understand this key optimization.

            for existing in &items.0[0..write] {
+               // Skip re-validating pairs that have already been processed by the heuristic
+               // This is the key optimization that reduces O(n²) vector comparisons
                if candidate.order.is_some() && existing.order.is_some() {
                    continue;
                }
                if score(candidate.idx, existing.idx) > candidate.score(target_point_id, &mut score)
                {
                    continue 'outer;
                }
            }

📥 Commits

Reviewing files that changed from the base of the PR and between b40f1e9 and af48ef3.

📒 Files selected for processing (1)
  • lib/segment/src/index/hnsw_index/links_container.rs (1 hunks)
🔇 Additional comments (4)
lib/segment/src/index/hnsw_index/links_container.rs (4)

107-108: Good use of test-only implementation

The reference implementation with #[cfg(test)] attribute is a great practice for maintaining a clear, simple version of complex algorithms for testing and validation purposes.


213-232: Excellent optimization with cached scores

The caching of computed scores using interior mutability in the Item struct effectively prevents redundant computation when the same score is needed multiple times during sorting and processing, which aligns well with the PR's performance optimization goals.


430-431: Strong test validation strategy

The iterative comparison between the optimized and reference implementations for each candidate is a robust way to ensure that the optimizations don't alter the correctness of the algorithm, which is essential for maintaining consistent search behavior after this change.


46-71: 🛠️ Refactor suggestion

Optimize early exit path to maintain heuristic processing state

When the container reaches level_m capacity (line 66), the function breaks out of the loop without updating processed_by_heuristic, which only happens at line 70 outside the loop. This could lead to incorrect optimization assumptions in subsequent operations.

            self.links.push(candidate.idx);
            if self.links.len() >= level_m {
+               self.processed_by_heuristic = self.links.len() as u32;
                break;
            }
        }
        self.processed_by_heuristic = self.links.len() as u32;

Likely an incorrect or invalid review comment.

Comment on lines +152 to +155
if self.links.len() < level_m {
self.links.push(new_point_id);
return;
}
Contributor

🛠️ Refactor suggestion

Missing update to processed_by_heuristic when adding link in non-full container

When the container isn't full, we push a new link but exit without updating processed_by_heuristic. This could lead to inefficient processing in future heuristic operations as we don't track that these links haven't been processed by the heuristic.

        if self.links.len() < level_m {
            self.links.push(new_point_id);
+           // Keep processed_by_heuristic to indicate early links haven't been heuristically validated
            return;
        }

Committable suggestion skipped: line range outside the PR's diff.

if let (Some(a_order), Some(b_order)) = (a.order, b.order) {
return a_order.cmp(&b_order);
}
b.score(target_point_id, &mut score)
Member

Do we assume that re-computing scores for already inserted points once is negligible?

Contributor Author

No, but I decided not to preserve them, because it would impose a 2x RAM increase (see "Further improvements" point 1 in the PR description). If we decide to save them, I suggest doing it in a separate PR.

Also, .score() is lazy/cached; the second call returns the saved result.

And calling .score() is skipped if both a and b are already processed by heuristics.
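A minimal sketch of the lazy/cached score described above (hypothetical names; the real Item lives in links_container.rs and uses Cell<Option<ScoreType>> for the same purpose):

```rust
use std::cell::Cell;

/// Hypothetical sketch of an item whose score to the target point is
/// computed at most once; later calls return the cached value.
struct Item {
    idx: u32,
    cached_score: Cell<Option<f32>>,
}

impl Item {
    fn score(&self, target: u32, score_fn: &mut impl FnMut(u32, u32) -> f32) -> f32 {
        if let Some(s) = self.cached_score.get() {
            return s; // second and later calls: no vector comparison
        }
        let s = score_fn(self.idx, target);
        self.cached_score.set(Some(s));
        s
    }
}

fn main() {
    let mut calls = 0;
    let mut score_fn = |a: u32, b: u32| {
        calls += 1;
        (a + b) as f32 // stand-in for a real vector comparison
    };
    let item = Item { idx: 3, cached_score: Cell::new(None) };
    let s1 = item.score(7, &mut score_fn);
    let s2 = item.score(7, &mut score_fn);
    assert_eq!((s1, s2, calls), (10.0, 10.0, 1)); // score_fn ran only once
}
```

Cell gives interior mutability through a shared reference, which is what allows caching inside a comparison callback during sorting.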

self.links.clear();

let mut write = 0;
'outer: for read in 0..items.0.len() {
Member

This part was not trivial to read. I'm not sure what kind of comments would help here, but I had to use Copilot to generate some explanations.

Contributor Author

Added an explanation comment.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
lib/segment/src/index/hnsw_index/links_container.rs (1)

152-155: 🛠️ Refactor suggestion

Missing update to processed_by_heuristic when adding link in non-full container.

When the container isn't full, we push a new link but exit without updating processed_by_heuristic. This could lead to inefficient processing in future heuristic operations as we don't track that these links haven't been processed by the heuristic.

🧹 Nitpick comments (1)
lib/segment/src/index/hnsw_index/links_container.rs (1)

84-84: Consider evaluating binary search ROI.

The TODO comment suggests using binary search for insertion position. Since this is in a performance-critical path, it would be worth benchmarking whether binary search would provide meaningful improvements for typical link collection sizes.


📥 Commits

Reviewing files that changed from the base of the PR and between af48ef3 and 4cdd6cd.

📒 Files selected for processing (1)
  • lib/segment/src/index/hnsw_index/links_container.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
lib/segment/src/index/hnsw_index/links_container.rs (1)
Learnt from: xzfc
PR: qdrant/qdrant#6501
File: lib/segment/src/index/hnsw_index/links_container.rs:158-158
Timestamp: 2025-05-08T16:39:07.893Z
Learning: In the Qdrant codebase, allocation failures are generally not handled (allowed to panic) except in very few specific cases. Most `unwrap()` calls on allocation operations are intentional.
🔇 Additional comments (6)
lib/segment/src/index/hnsw_index/links_container.rs (6)

1-13: Clear data structure design with effective tracking mechanism.

The LinksContainer struct's design is well-aligned with the PR's optimization goal. The processed_by_heuristic counter is key to tracking which links have already been processed by the heuristic algorithm, avoiding redundant vector comparisons.


46-71: Effective implementation of greedy heuristic selection.

The implementation correctly selects candidates that aren't dominated by other candidates based on a scoring function. This is a key optimization in HNSW graph construction. The method also properly updates processed_by_heuristic to track the processed links.
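The domination check described here can be sketched outside the Qdrant types. The function name, signatures, and the exact form of the domination condition below are illustrative assumptions, not the actual `LinksContainer` API:

```rust
// Greedy heuristic selection: iterate candidates in order of decreasing
// score to the query; keep a candidate only if no already-selected point
// dominates it (i.e. is closer to the candidate than the query is).
// All names and signatures here are illustrative, not Qdrant's API.
fn select_heuristic(
    candidates: &[u32],
    limit: usize,
    score: impl Fn(u32, u32) -> f32,       // similarity between two points
    query_score: impl Fn(u32) -> f32,      // similarity of a point to the query
) -> Vec<u32> {
    let mut selected: Vec<u32> = Vec::with_capacity(limit);
    for &cand in candidates {
        if selected.len() >= limit {
            break;
        }
        // Dominated: some selected point is more similar to `cand`
        // than the query is.
        let dominated = selected
            .iter()
            .any(|&s| score(cand, s) > query_score(cand));
        if !dominated {
            selected.push(cand);
        }
    }
    selected
}

fn main() {
    // Toy 1-D example: points on a line, query at 0.0,
    // similarity = negated distance.
    let pos = [1.0_f32, 1.1, -3.0];
    let ids = [0_u32, 1, 2];
    let selected = select_heuristic(
        &ids,
        2,
        |a, b| -(pos[a as usize] - pos[b as usize]).abs(),
        |a| -pos[a as usize],
    );
    // Point 1 is dominated by point 0 (they are near-duplicates),
    // so the heuristic keeps the more diverse pair 0 and 2.
    assert_eq!(selected, vec![0, 2]);
}
```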


187-199: Good explanation of algorithm optimization.

The comments effectively explain the two key optimizations in this implementation:

  1. Skipping score checks for items already processed by the heuristic
  2. Using in-place updates to avoid separate input and output containers

This clearly shows the algorithmic improvements that drive the performance gains reported in the PR.


201-220: Well-implemented in-place processing algorithm.

The in-place reading and updating algorithm is clever - it uses separate read and write pointers to process the items buffer without allocating additional memory. This contributes to the performance improvements described in the PR by reducing memory operations.
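The read/write-cursor technique praised here is the classic in-place compaction pattern. A minimal stand-in (the real loop additionally compares scores and tracks `processed_by_heuristic`):

```rust
// In-place filtering with separate read and write cursors: kept elements
// are compacted toward the front, then the tail is truncated. No extra
// buffer is allocated.
fn retain_in_place(items: &mut Vec<u32>, keep: impl Fn(u32) -> bool) {
    let mut write = 0;
    for read in 0..items.len() {
        if keep(items[read]) {
            items[write] = items[read];
            write += 1;
        }
    }
    items.truncate(write);
}

fn main() {
    let mut v = vec![3, 1, 4, 1, 5, 9, 2, 6];
    retain_in_place(&mut v, |x| x % 2 == 1);
    // Only the odd values survive, in their original order.
    assert_eq!(v, vec![3, 1, 1, 5, 9]);
}
```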


238-250: Efficient score caching with interior mutability.

The cached_score method smartly uses interior mutability via Cell to compute scores only once and cache them. This is a key optimization that helps reduce redundant vector comparisons, which directly addresses the PR's main goal.
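The `Cell`-based lazy caching pattern described here can be sketched as follows. The struct and method names are hypothetical; only the pattern (memoize on first call through `&self`) matches what the review describes:

```rust
use std::cell::Cell;

// Lazy, cached score: the expensive computation runs at most once, even
// when `get` is called repeatedly. Illustrative sketch of the pattern.
struct CachedScore<F: Fn() -> f32> {
    score: Cell<Option<f32>>,
    compute: F,
}

impl<F: Fn() -> f32> CachedScore<F> {
    fn new(compute: F) -> Self {
        Self { score: Cell::new(None), compute }
    }

    // Interior mutability via Cell lets `get` take `&self` while still
    // memoizing the result on first use.
    fn get(&self) -> f32 {
        match self.score.get() {
            Some(s) => s,
            None => {
                let s = (self.compute)();
                self.score.set(Some(s));
                s
            }
        }
    }
}

fn main() {
    let calls = Cell::new(0);
    let cs = CachedScore::new(|| {
        calls.set(calls.get() + 1);
        42.0
    });
    assert_eq!(cs.get(), 42.0);
    assert_eq!(cs.get(), 42.0);
    // The computation ran only once despite two calls.
    assert_eq!(calls.get(), 1);
}
```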


392-451: Thorough testing ensures correctness of optimized implementation.

The test performs extensive verification (1000 iterations) comparing the optimized connect_with_heuristic method against the reference implementation, ensuring that the optimization doesn't affect correctness. This is crucial for validating that performance improvements don't come at the cost of accuracy.

@xzfc xzfc merged commit 998675e into dev May 9, 2025
17 checks passed
@xzfc xzfc deleted the heuristic-opt branch May 9, 2025 21:48
@coderabbitai coderabbitai bot mentioned this pull request May 19, 2025
generall pushed a commit that referenced this pull request May 22, 2025
* Refactor out LinksContainer

This commit extracts Vec<PointOffsetType> and heuristic related
functions from graph_layers_builder.rs into a new file.

No significant changes in logic are made.

* Optimize LinksContainer

* Add illustration

* Fixups

* Add more comments and rename `score` to `cached_score`
This was referenced Jul 24, 2025