Heuristic optimization #6501
Conversation
Pull Request Overview
This PR optimizes the heuristic computation used during HNSW graph building by reducing redundant vector comparisons.
- Introduces a new LinksContainer abstraction to track processed links.
- Replaces the old LinkContainer type with LinksContainer across graph layers and GPU link updates.
- Updates heuristic connection logic and modifies related tests accordingly.
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| lib/segment/src/index/hnsw_index/mod.rs | Added the new module for LinksContainer. |
| lib/segment/src/index/hnsw_index/graph_layers_builder.rs | Replaced LinkContainer with LinksContainer and updated connection and heuristic logic. |
| lib/segment/src/index/hnsw_index/gpu/gpu_links.rs | Modified code to use the new LinksContainer API when filtering and setting links. |
Comments suppressed due to low confidence (1)
lib/segment/src/index/hnsw_index/graph_layers_builder.rs:806
- The removal of tests for candidate selection and connection logic reduces test coverage for the new heuristic optimization. Please ensure that equivalent tests are added to validate the updated behavior.
// Removed tests for candidate selection heuristics and connect_new_point
Actionable comments posted: 2
🧹 Nitpick comments (3)
lib/segment/src/index/hnsw_index/links_container.rs (3)
`9-13`: **Consider using `usize` for `processed_by_heuristic`**

The field is always cast back and forth between `usize` and `u32`. Since `Vec::len()` already returns `usize`, keeping the same type avoids the casts and the risk of silent truncation on very large graphs.

```diff
- processed_by_heuristic: u32,
+ processed_by_heuristic: usize,
```

A simple, non-breaking change.

`157-159`: **Avoid panicking on allocation failures**

`try_set_capacity_exact(...).unwrap()` will panic if memory cannot be reserved. In a large-scale build this can abort the whole indexing process. Propagate the error (`?`) or fall back to a slower but safe path instead of panicking.

`11-12`: **Minor wording nitpick in the doc comment**

"Amount of links that processed by the heuristic." → "Number of links that have been processed by the heuristic."
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)

- lib/segment/src/index/hnsw_index/gpu/gpu_links.rs (3 hunks)
- lib/segment/src/index/hnsw_index/graph_layers_builder.rs (15 hunks)
- lib/segment/src/index/hnsw_index/links_container.rs (1 hunks)
- lib/segment/src/index/hnsw_index/mod.rs (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
lib/segment/src/index/hnsw_index/gpu/gpu_links.rs (1)
- lib/segment/src/index/hnsw_index/links_container.rs (1): `links` (27-29)
⏰ Context from checks skipped due to timeout of 90000ms (13)
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: test-low-resources
- GitHub Check: test-consistency
- GitHub Check: test-consensus-compose
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: integration-tests
- GitHub Check: integration-tests-consensus
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: Basic TLS/HTTPS tests
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: lint
- GitHub Check: storage-compat-test
🔇 Additional comments (23)
lib/segment/src/index/hnsw_index/mod.rs (1)
`11-11`: **New module declaration looks good.**

This new module declaration properly follows Rust conventions and integrates the new `LinksContainer` functionality seamlessly into the existing HNSW index structure.

lib/segment/src/index/hnsw_index/gpu/gpu_links.rs (3)
`190-191`: **Correctly updated to use the new container API.**

The code now correctly accesses the links through the `.links()` method of the `LinksContainer`, which properly encapsulates the internal representation of links.

`200-200`: **Correctly adapted to use the new container API.**

The code now passes the links accessed via the `.links()` method to `set_links`, correctly integrating with the new `LinksContainer` API.

`263-270`: **Simplified link update with the new `fill_from` method.**

This change replaces the previous pattern of clearing and extending link vectors with a single `fill_from` call, which is more concise and likely more efficient since it avoids intermediate operations.

lib/segment/src/index/hnsw_index/graph_layers_builder.rs (18)
`7-8`: **Import optimization looks good.**

Simplified the imports by removing unnecessary types that are no longer directly used after the refactoring.

`15-15`: **Correctly importing the new container types.**

The import of `ItemsBuffer` and `LinksContainer` from the new module is correctly added, making these types available for use in this file.

`24-24`: **Good type alias update for the container abstraction.**

The `LockedLinkContainer` type is properly updated to use the new `LinksContainer` struct, maintaining the existing locking pattern with `RwLock`.

`179-183`: **Correctly using `LinksContainer` in initialization.**

The code now properly initializes the links layers with `LinksContainer` instances instead of raw vectors. The capacity is correctly set based on the `reserve` parameter.

`242-244`: **Updated iteration with the `LinksContainer` API.**

The code now correctly iterates over the links stored in the `LinksContainer` using the `.iter()` method.

`246-246`: **Correctly using `LinksContainer` conversion.**

The code now properly converts from `LinksContainer` to a vector for processing using the `.into_vec()` method.

`287-287`: **Properly using `LinksContainer` for level initialization.**

The code now correctly initializes a new level with a `LinksContainer` of the appropriate capacity instead of a raw vector.

`357-357`: **Simplified link addition with `fill_from`.**

Using `fill_from` to populate links from a source iterator is more concise and likely more efficient than the previous approach of manual extension.

`435-437`: **Efficiently using sorted heuristic filling.**

The code now uses the specialized `fill_from_sorted_with_heuristic` method to populate links from a sorted iterator, which can optimize the heuristic pruning process by avoiding redundant comparisons.

`440-440`: **Good reuse of `ItemsBuffer` for heuristic optimization.**

Using a reusable `ItemsBuffer` instance can reduce memory allocations during heuristic pruning, potentially improving performance.

`442-444`: **Encapsulated heuristic connection logic.**

The code now delegates the complex heuristic connection logic to the `LinksContainer` through its `connect_with_heuristic` method, which should improve code maintainability and encapsulation.

`460-460`: **Simplified connection with the encapsulated method.**

Using the `connect` method of `LinksContainer` simplifies the code by encapsulating the link connection logic.

`465-465`: **Consistent use of the `LinksContainer` API.**

The code consistently uses the `connect` method for the bidirectional link connections, ensuring symmetry in the graph connections.

`482-482`: **Correctly accessing links through the container API.**

In the `get_average_connectivity_on_level` method, the code now correctly accesses the links through the `.links()` method of the `LinksContainer`.

`636-636`: **Updated test code for the new container API.**

Test code has been properly updated to use the `.links()` method for accessing the links in the `LinksContainer`.

`699-699`: **Test code correctly using the new container API.**

The test code correctly retrieves links using `.links()` for comparison, ensuring that tests work correctly with the new container implementation.

`729-729`: **Consistently updated all test code.**

All test code has been consistently updated to use the new container API, ensuring that tests continue to function correctly after the refactoring.

`1-807`: **Overall refactoring significantly improves heuristic optimization.**

The comprehensive refactoring to use the new `LinksContainer` and its specialized methods for heuristic pruning aligns with the PR objective of optimizing heuristic computation during HNSW graph construction. The code now avoids redundant vector comparisons by tracking processed points in the container, which should lead to the performance improvements described in the PR summary (up to 38.8% faster graph building).

The changes maintain the original behavior while making the code more maintainable and efficient. The consistent API usage across both the main implementation and the GPU-specific code ensures that all parts of the system work together correctly.
lib/segment/src/index/hnsw_index/links_container.rs (1)
`59-64`: **Verify the comparison direction in the heuristic filter**

`score(candidate.idx, existing) > candidate.score` assumes that a larger score means closer (cosine-similarity style). If `ScoreType` is a distance (smaller = closer, as `EuclidMetric` typically is), the comparison must be inverted (`<`) or the heuristic will reject all valid neighbours and degrade recall.

Please double-check the sign convention used across the codebase.
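To make the sign-convention concern concrete, here is a toy check (the `passes_filter` helper is hypothetical, not the reviewed code) showing how the same `>` filter behaves under the two conventions:

```rust
// Hypothetical helper mirroring the reviewed filter: a candidate is rejected
// when it scores higher against an existing neighbour than against the target.
fn passes_filter(cand_to_existing: f32, cand_to_target: f32) -> bool {
    !(cand_to_existing > cand_to_target)
}

fn main() {
    // Similarity convention (higher = closer): a candidate far from the
    // existing neighbour (0.1) but close to the target (0.8) is kept.
    assert!(passes_filter(0.1, 0.8));
    // Same geometry expressed as raw distances (0.9 to the existing
    // neighbour, 0.2 to the target): the unmodified `>` check now wrongly
    // rejects a valid, diverse neighbour.
    assert!(!passes_filter(0.9, 0.2));
}
```

This is why the comparison must be inverted if `ScoreType` carries distances rather than similarities.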
This commit extracts `Vec<PointOffsetType>` and heuristic-related functions from graph_layers_builder.rs into a new file. No significant changes in logic are made.
Actionable comments posted: 1
♻️ Duplicate comments (2)
lib/segment/src/index/hnsw_index/links_container.rs (2)
`66-71`: ⚠️ Potential issue — **Fix `processed_by_heuristic` not being updated on early return**

When the container fills up to `level_m`, the function returns early without updating `processed_by_heuristic`, which can cause subsequent heuristic optimizations to be ineffective.

```diff
         self.links.push(candidate.idx);
         if self.links.len() >= level_m {
+            self.processed_by_heuristic = level_m as u32;
             return;
         }
     }
     self.processed_by_heuristic = self.links.len() as u32;
```
`96-102`: 🛠️ Refactor suggestion — **Check for duplicates before insertion in `connect`**

The current implementation may insert duplicates of the same point ID, which would waste capacity and potentially violate the assumption that each neighbor appears only once.

```diff
+        // Check if new_point_id is already in the container
+        if self.links.contains(&new_point_id) {
+            return;
+        }
+
         if self.links.len() < level_m {
             self.links.insert(id_to_insert, new_point_id);
         } else if id_to_insert != self.links.len() {
             self.links.pop();
             self.links.insert(id_to_insert, new_point_id);
         }
```
🧹 Nitpick comments (3)
lib/segment/src/index/hnsw_index/links_container.rs (3)
`139-206`: **Add more documentation to the complex `connect_with_heuristic` method**

This method implements a non-trivial optimization using the `processed_by_heuristic` tracking. Consider adding comments explaining:

- How the `order` field is used to track which items have already been processed by the heuristic
- The optimization strategy that avoids redundant score calculations
- The relationship between this implementation and the reference implementation

`248-250`: **Consider documenting why this test is ignored**

The `test_candidate_selection_heuristics` test is marked with `#[ignore]`, but it's not clear why. If this is an important test for validating heuristic logic, consider enabling it or documenting why it is ignored.

`209-231`: **Consider a more explicit approach to score caching**

The `Item` struct uses interior mutability with `Cell<Option<ScoreType>>` for score caching, which adds complexity. Consider a more explicit approach to make the code more readable:

- Either compute all scores upfront, or
- Use a separate cache structure that maps point pairs to scores
📒 Files selected for processing (6)

- lib/segment/src/index/hnsw_index/gpu/gpu_insert_context.rs (2 hunks)
- lib/segment/src/index/hnsw_index/gpu/gpu_links.rs (3 hunks)
- lib/segment/src/index/hnsw_index/gpu/mod.rs (1 hunks)
- lib/segment/src/index/hnsw_index/graph_layers_builder.rs (15 hunks)
- lib/segment/src/index/hnsw_index/links_container.rs (1 hunks)
- lib/segment/src/index/hnsw_index/mod.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- lib/segment/src/index/hnsw_index/gpu/gpu_links.rs
- lib/segment/src/index/hnsw_index/mod.rs
- lib/segment/src/index/hnsw_index/graph_layers_builder.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
lib/segment/src/index/hnsw_index/gpu/gpu_insert_context.rs (1)
- lib/segment/src/index/hnsw_index/links_container.rs (1): `with_capacity` (16-21)
🔇 Additional comments (7)
lib/segment/src/index/hnsw_index/gpu/mod.rs (2)
`203-205`: **LGTM: Updated to use the new `LinksContainer` API**

The code correctly adapts to use the new `links()` method from the `LinksContainer` abstraction instead of directly cloning the entire lock guard.

`207-209`: **LGTM: Consistent API usage**

This change correctly mirrors the same pattern as above, maintaining consistency in how link data is accessed.
lib/segment/src/index/hnsw_index/gpu/gpu_insert_context.rs (3)
`477-477`: **LGTM: Added import for the new abstraction**

Correctly added the import for the new `LinksContainer` type.

`977-982`: **LGTM: Improved heuristic implementation using `LinksContainer`**

The code now properly uses the new `LinksContainer` abstraction for heuristic selection instead of the previously removed static method from `GraphLayersBuilder`.

`984-984`: **LGTM: Simplified iteration over heuristic results**

The iteration is now more concise, directly using the `LinksContainer`'s iterator.

lib/segment/src/index/hnsw_index/links_container.rs (2)

`9-13`: **LGTM: Clear structure with good documentation**

The `LinksContainer` struct is well defined with a clear purpose and documentation. The field `processed_by_heuristic` is properly explained to track heuristic optimization state.

`16-44`: **LGTM: Well-designed basic container operations**

The basic operations (`with_capacity`, `push`, `links`, `iter`, `into_vec`, `fill_from`) provide a complete and consistent API for managing links.
GPU test is green
Actionable comments posted: 1
🧹 Nitpick comments (3)
lib/segment/src/index/hnsw_index/links_container.rs (3)
`249-250`: **Document the reason for ignoring the test**

The test is marked with `#[ignore]` without explaining why. Add a comment to clarify why this test is ignored by default (e.g., if it's time-consuming, resource-intensive, or only needed for specific debugging).

```diff
 #[test]
 #[ignore]
+// This test is ignored by default because it's primarily for manual inspection/debugging
+// of heuristic behavior rather than automated verification
 fn test_candidate_selection_heuristics() {
```

`174-181`: **Complex sorting logic may benefit from comments**

The sorting logic that prioritizes items with order information while falling back to score comparison for other items is a key part of the optimization, but it's complex enough to warrant an explanatory comment about why this approach is used.

```diff
 items.0.sort_unstable_by(|a, b| {
+    // Prioritize items with known order (already processed by heuristic)
+    // to preserve their relative ordering and avoid redundant comparisons
     if let (Some(a_order), Some(b_order)) = (a.order, b.order) {
         return a_order.cmp(&b_order);
     }
     b.score(target_point_id, &mut score)
         .total_cmp(&a.score(target_point_id, &mut score))
 });
```

`188-196`: **Optimize heuristic filtering for pre-processed items**

The optimization logic at lines 189-191 that skips heuristic checks for previously processed items is central to the performance improvement described in the PR, but it's not immediately obvious how this affects algorithm behavior. A brief comment would help future maintainers understand this key optimization.

```diff
 for existing in &items.0[0..write] {
+    // Skip re-validating pairs that have already been processed by the heuristic
+    // This is the key optimization that reduces O(n²) vector comparisons
     if candidate.order.is_some() && existing.order.is_some() {
         continue;
     }
     if score(candidate.idx, existing.idx) > candidate.score(target_point_id, &mut score) {
         continue 'outer;
     }
 }
```
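The effect of this skip can be illustrated with a small counting sketch (the `Item` type and the counting function here are illustrative, not the actual Qdrant struct):

```rust
// Illustrative item: `order` is Some(_) for links already validated by a
// previous heuristic pass, None for freshly added candidates.
struct Item {
    idx: i32,
    order: Option<usize>,
}

// Count how many pairwise score calls the filter would actually make:
// pairs where BOTH sides were already processed are skipped entirely.
fn comparisons_needed(items: &[Item]) -> usize {
    let mut count = 0;
    for (i, cand) in items.iter().enumerate() {
        for existing in &items[0..i] {
            if cand.order.is_some() && existing.order.is_some() {
                continue; // old-vs-old pair: result already known
            }
            count += 1; // would call score(cand.idx, existing.idx) here
        }
    }
    count
}

fn main() {
    // Three old links (already processed) plus two new candidates.
    let items: Vec<Item> = (0..5)
        .map(|i| Item { idx: i, order: if i < 3 { Some(i as usize) } else { None } })
        .collect();
    // Full pairwise work would be 10 comparisons; the 3 old-vs-old pairs are skipped.
    assert_eq!(comparisons_needed(&items), 7);
}
```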
📒 Files selected for processing (1)

- lib/segment/src/index/hnsw_index/links_container.rs (1 hunks)
🔇 Additional comments (4)
lib/segment/src/index/hnsw_index/links_container.rs (4)
`107-108`: **Good use of a test-only implementation**

The reference implementation with the `#[cfg(test)]` attribute is a great practice for maintaining a clear, simple version of complex algorithms for testing and validation purposes.

`213-232`: **Excellent optimization with cached scores**

The caching of computed scores using interior mutability in the `Item` struct effectively prevents redundant computation when the same score is needed multiple times during sorting and processing, which aligns well with the PR's performance-optimization goals.

`430-431`: **Strong test validation strategy**

The iterative comparison between the optimized and reference implementations for each candidate is a robust way to ensure that the optimizations don't alter the correctness of the algorithm, which is essential for maintaining consistent search behavior after this change.
`46-71`: 🛠️ Refactor suggestion — **Optimize the early exit path to maintain heuristic processing state**

When the container reaches `level_m` capacity (line 66), the function breaks out of the loop without updating `processed_by_heuristic`, which only happens at line 70 outside the loop. This could lead to incorrect optimization assumptions in subsequent operations.

```diff
         self.links.push(candidate.idx);
         if self.links.len() >= level_m {
+            self.processed_by_heuristic = self.links.len() as u32;
             break;
         }
     }
     self.processed_by_heuristic = self.links.len() as u32;
```

Likely an incorrect or invalid review comment.
```rust
if self.links.len() < level_m {
    self.links.push(new_point_id);
    return;
}
```
🛠️ Refactor suggestion

**Missing update to `processed_by_heuristic` when adding a link in a non-full container**

When the container isn't full, we push a new link but exit without updating `processed_by_heuristic`. This could lead to inefficient processing in future heuristic operations, as we don't track that these links haven't been processed by the heuristic.

```diff
 if self.links.len() < level_m {
     self.links.push(new_point_id);
+    // Keep processed_by_heuristic to indicate early links haven't been heuristically validated
     return;
 }
```

Committable suggestion skipped: line range outside the PR's diff.
```rust
if let (Some(a_order), Some(b_order)) = (a.order, b.order) {
    return a_order.cmp(&b_order);
}
b.score(target_point_id, &mut score)
```
Do we assume that re-computing scores for already inserted points once is negligible?
No, but I decided not to preserve them because it would impose a 2x RAM increase (see "Further improvements" point 1 in the PR description). If we decide to save them, I suggest doing it in a separate PR.

Also, `.score()` is lazy/cached, so the second call returns the saved result. And calling `.score()` is skipped if both `a` and `b` are already processed by the heuristic.
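The lazy/cached `.score()` behavior described here can be sketched with `Cell`-based interior mutability (the field and method names below are illustrative, not the actual Qdrant types):

```rust
use std::cell::Cell;

// Illustrative item with a lazily computed, cached score.
struct Item {
    idx: u32,
    cached: Cell<Option<f32>>,
}

impl Item {
    // First call computes and stores the score; later calls return it for free.
    fn cached_score(&self, target: u32, score: &mut impl FnMut(u32, u32) -> f32) -> f32 {
        match self.cached.get() {
            Some(s) => s,
            None => {
                let s = score(self.idx, target);
                self.cached.set(Some(s));
                s
            }
        }
    }
}

fn main() {
    let mut calls = 0;
    let mut score = |a: u32, b: u32| {
        calls += 1; // count how many real vector comparisons happen
        -((a as f32) - (b as f32)).abs()
    };
    let item = Item { idx: 7, cached: Cell::new(None) };
    let first = item.cached_score(0, &mut score);
    let second = item.cached_score(0, &mut score); // cache hit, no re-computation
    assert_eq!(first, second);
    assert_eq!(calls, 1);
}
```

`Cell` is enough here because the cached value is `Copy`; the score is computed at most once per item even when the comparator reads it repeatedly during sorting.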
```rust
self.links.clear();

let mut write = 0;
'outer: for read in 0..items.0.len() {
```
It was not trivial to read this part. I am not sure which kind of comments would help here, but I had to use Copilot to generate some explanations.
Added an explanation comment.
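For readers following along, the read/write-pointer compaction under discussion can be sketched as an in-place filter (with a toy domination rule; this is not the actual LinksContainer code):

```rust
// In-place filter using separate read/write pointers: accepted items are
// compacted into the front of the vector, so no second buffer is allocated.
fn retain_not_dominated(items: &mut Vec<i32>, target: i32) {
    let mut write = 0;
    'outer: for read in 0..items.len() {
        let cand = items[read];
        // Compare the candidate against everything already accepted ([0..write]).
        for &kept in &items[0..write] {
            if (cand - kept).abs() < (cand - target).abs() {
                continue 'outer; // dominated: closer to a kept item than to the target
            }
        }
        items[write] = cand; // accept: compact in place
        write += 1;
    }
    items.truncate(write);
}

fn main() {
    // Target at 0: candidate 2 is dropped because it sits closer to the
    // already-kept neighbour 1 than to the target.
    let mut items = vec![1, 2, -5];
    retain_not_dominated(&mut items, 0);
    assert_eq!(items, vec![1, -5]);
}
```

The invariant is that `items[0..write]` always holds the accepted prefix, which is why the inner loop only scans that range.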
Actionable comments posted: 0
♻️ Duplicate comments (1)
lib/segment/src/index/hnsw_index/links_container.rs (1)
`152-155`: 🛠️ Refactor suggestion — **Missing update to `processed_by_heuristic` when adding a link in a non-full container.**

When the container isn't full, we push a new link but exit without updating `processed_by_heuristic`. This could lead to inefficient processing in future heuristic operations, as we don't track that these links haven't been processed by the heuristic.
🧹 Nitpick comments (1)
lib/segment/src/index/hnsw_index/links_container.rs (1)
`84-84`: **Consider evaluating binary-search ROI.**

The TODO comment suggests using binary search for the insertion position. Since this is in a performance-critical path, it would be worth benchmarking whether binary search would provide meaningful improvements for typical link collection sizes.
📒 Files selected for processing (1)

- lib/segment/src/index/hnsw_index/links_container.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
lib/segment/src/index/hnsw_index/links_container.rs (1)
Learnt from: xzfc
PR: qdrant/qdrant#6501
File: lib/segment/src/index/hnsw_index/links_container.rs:158-158
Timestamp: 2025-05-08T16:39:07.893Z
Learning: In the Qdrant codebase, allocation failures are generally not handled (allowed to panic) except in very few specific cases. Most `unwrap()` calls on allocation operations are intentional.
🔇 Additional comments (6)
lib/segment/src/index/hnsw_index/links_container.rs (6)
`1-13`: **Clear data-structure design with an effective tracking mechanism.**

The `LinksContainer` struct's design is well aligned with the PR's optimization goal. The `processed_by_heuristic` counter is key to tracking which links have already been processed by the heuristic algorithm, avoiding redundant vector comparisons.

`46-71`: **Effective implementation of greedy heuristic selection.**

The implementation correctly selects candidates that aren't dominated by other candidates based on a scoring function. This is a key optimization in HNSW graph construction. The method also properly updates `processed_by_heuristic` to track the processed links.

`187-199`: **Good explanation of the algorithm optimization.**

The comments effectively explain the two key optimizations in this implementation:

- Skipping score checks for items already processed by the heuristic
- Using in-place updates to avoid separate input and output containers

This clearly shows the algorithmic improvements that drive the performance gains reported in the PR.

`201-220`: **Well-implemented in-place processing algorithm.**

The in-place reading and updating algorithm is clever: it uses separate read and write pointers to process the items buffer without allocating additional memory. This contributes to the performance improvements described in the PR by reducing memory operations.

`238-250`: **Efficient score caching with interior mutability.**

The `cached_score` method smartly uses interior mutability via `Cell` to compute scores only once and cache them. This is a key optimization that helps reduce redundant vector comparisons, which directly addresses the PR's main goal.

`392-451`: **Thorough testing ensures correctness of the optimized implementation.**

The test performs extensive verification (1000 iterations) comparing the optimized `connect_with_heuristic` method against the reference implementation, ensuring that the optimization doesn't affect correctness. This is crucial for validating that performance improvements don't come at the cost of accuracy.
* Refactor out LinksContainer

  This commit extracts `Vec<PointOffsetType>` and heuristic-related functions from graph_layers_builder.rs into a new file. No significant changes in logic are made.

* Optimize LinksContainer
* Add illustration
* Fixups
* Add more comments and rename `score` to `cached_score`
This PR optimizes heuristic computation.
When building the HNSW graph (particularly when adding incoming links), the following loop pattern happens:
A few links are added to the container, then this container is pruned using the heuristic and the loop starts again.
Illustrated by pseudo code:
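A minimal sketch of this pruning step (the `prune_with_heuristic` name and the toy 1-D scoring are illustrative assumptions, not the actual Qdrant API; higher score = closer):

```rust
// Classic HNSW neighbour-selection heuristic: keep a candidate only if it is
// closer to the target than to any already-selected neighbour.
fn prune_with_heuristic(
    candidates: &mut Vec<i32>,        // toy 1-D "vectors" (positions on a line)
    level_m: usize,                   // maximum number of links to keep
    score: impl Fn(i32, i32) -> f32,  // similarity: higher = closer
    target: i32,
) {
    // Sort by similarity to the target, best first.
    candidates.sort_by(|&a, &b| score(b, target).total_cmp(&score(a, target)));
    let mut selected: Vec<i32> = Vec::new();
    'outer: for &cand in candidates.iter() {
        if selected.len() >= level_m {
            break;
        }
        for &existing in &selected {
            // Each candidate is checked against every selected neighbour,
            // which is where the O(n*n) vector comparisons come from.
            if score(cand, existing) > score(cand, target) {
                continue 'outer; // dominated by an existing neighbour
            }
        }
        selected.push(cand);
    }
    *candidates = selected;
}

fn main() {
    // Toy similarity on a line: negative absolute distance.
    let score = |a: i32, b: i32| -((a - b).abs() as f32);
    let mut links = vec![1, 2, -5];
    // 2 is pruned: it is closer to the already-kept neighbour 1 than to the target 0.
    prune_with_heuristic(&mut links, 4, score, 0);
    assert_eq!(links, vec![1, -5]);
}
```

In the build loop, a few links are appended to the container and this pruning runs again, so the same pairs keep getting re-scored unless already-processed links are tracked.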
Running the heuristic requires O(n²) vector comparisons, so this part of the code can take a significant percentage of the graph-building time. However, some vector comparisons can be avoided if part of the input has already been processed by the heuristic.
This PR implements this optimization by introducing a new struct `LinksContainer` that tracks the number of points that have been processed by the heuristic and uses this information to speed up the next heuristic call.

A similar change was proposed two years ago in #1965 and #1968, and described in a Notion page.
Performance
Graph building time measured for a single segment (not the full dataset). Times in seconds.
Further improvements
Some ideas (not implemented in this PR):
Cache scores, i.e. add `scores: Vec<ScoreType>` into `LinksContainer`. I expect it to cut the number of comparisons by 33% at the cost of using 2x RAM.

Not implemented in this PR as it imposes a different trade-off (as opposed to the current implementation, which I believe is strictly better).
Use a smarter way to merge old and new scores. In this PR, I use a regular sort with a comparison function that can skip score calculation when comparing two old links (src). This slightly reduces the number of comparisons, but it is still not ideal. A better approach would be to use a merging/sorting algorithm that minimizes the number of comparisons; e.g., adding exactly one link should result in a binary search. Check the description of the `binary-merge` crate. (I'm not proposing to use this particular crate, but their problem statement perfectly describes what I mean.)

Not implemented in this PR as I think it would be overkill at this moment.
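The single-link case mentioned above can be sketched with `partition_point` (illustrative, not the Qdrant implementation): inserting one element into a score-sorted link list needs only O(log n) comparisons instead of a full re-sort.

```rust
// Insert one (point_id, score) pair into a list kept sorted by score,
// descending, using binary search for the insertion position.
fn insert_sorted_desc(links: &mut Vec<(u32, f32)>, new: (u32, f32)) {
    // partition_point returns the first index where the predicate fails,
    // i.e. the spot that keeps scores in descending order.
    let pos = links.partition_point(|&(_, s)| s > new.1);
    links.insert(pos, new);
}

fn main() {
    let mut links = vec![(10, 0.9), (4, 0.5), (7, 0.1)];
    insert_sorted_desc(&mut links, (2, 0.7));
    assert_eq!(links, vec![(10, 0.9), (2, 0.7), (4, 0.5), (7, 0.1)]);
}
```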