Skip to content

Conversation

JojiiOfficial
Copy link
Contributor

@JojiiOfficial JojiiOfficial commented Jun 4, 2025

Fixes extremely high memory usage in Distance Matrix API on certain datasets with a large sample rate (>= ~50k).

Dev:
image

PR:
image

@JojiiOfficial JojiiOfficial force-pushed the fix_memory_leak_in_distance_matrix branch from c30253d to df24f78 Compare June 4, 2025 16:32
@qdrant qdrant deleted a comment from coderabbitai bot Jun 4, 2025
coderabbitai[bot]

This comment was marked as resolved.

@JojiiOfficial JojiiOfficial force-pushed the fix_memory_leak_in_distance_matrix branch from df24f78 to 2beb70e Compare June 4, 2025 16:36
@JojiiOfficial JojiiOfficial force-pushed the fix_memory_leak_in_distance_matrix branch from 2beb70e to 8e0cbf0 Compare June 4, 2025 16:37

This comment was marked as resolved.

Copy link
Member

@KShivendu KShivendu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Can you please share script to repro the bug/results? :)

@JojiiOfficial JojiiOfficial changed the title Fix memory leak in distance matrix Fix enormous memory usage in Distance Matrix API on high sample size Jun 5, 2025
@@ -2608,7 +2609,7 @@ impl From<JsonPath> for IsEmptyCondition {
#[derive(Debug, Deserialize, Serialize, JsonSchema, Clone, PartialEq, Eq)]
pub struct HasIdCondition {
#[schemars(schema_with = "HashSet::<PointIdType>::json_schema")]
pub has_id: AHashSet<PointIdType>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not always have an Arc?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the answer lies here:

/// Threshold determining when to use an `Arc` in `HasIdCondition` if the condition includes many points.
/// Since we're cloning filters quite a lot, using an Arc for larger conditions reduces risk of memory leaks
/// and potentially improves performance in some places.
const HAS_ID_CONDITION_ARC_THRESHOLD: usize = 1_000;

meaning, a balance between performance and memory usage.

Though, I'm also curious how always using an Arc behaves. @JojiiOfficial Would it be possible to benchmark it?

Copy link
Contributor Author

@JojiiOfficial JojiiOfficial Jun 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a comparison between cloning raw+arc and creating an arc using Arc::new() of the same hashset used in each iteration.
image

I suggest that we set the HAS_ID_CONDITION_ARC_THRESHOLD to something around 60 (probably 64 due to power of 2 allocation logic of rust) to not introduce overhead in places where we might create a lot of small filters without much cloning. Otherwise we might introduce an unexpected performance regression.

Wdyt?

Arc(Arc<T>),
}

impl<T> MaybeArc<T> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A nice to have would be to choose arc/no_arc automatically based on the size_of_value() of the inner value as a constructor function of this type.

Based on the provided chart, and considering ExtendedPointId is 24 bytes, the threshold could be around 1536 bytes (24 * 64)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size_of_value doesn't work with HashSet though. When using MaybeArc wrapped around HashMaps or Vec, size_of_value would only return the stack size of the type, which is always constant regardless of the amount of items.
Therefore we can only apply this heuristic at each usage of MaybeArc instead inside a constructor, sadly.

@JojiiOfficial JojiiOfficial merged commit 64364f3 into dev Jun 5, 2025
17 checks passed
@JojiiOfficial JojiiOfficial deleted the fix_memory_leak_in_distance_matrix branch June 5, 2025 15:46
generall pushed a commit that referenced this pull request Jul 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants