Skip to content

Conversation

timvisee
Copy link
Member

@timvisee timvisee commented Apr 15, 2025

A very simple change - replace the standard HashMap with AHashMap. Or more specifically, hash using ahash.

It changes the hashmap in the mmap/rocksdb update buffer types where it lists changes pending flush. Its also applied to some surrounding types/functions. There are more places we can benefit from this, but lets take it step by step.

Ahash is much faster, at least on small numeric keys like we're using in many places. Here's a generic benchmark of sip13 (used by standard HashMap) versus ahash:

image
source

I've added a criterion benchmark that tests the change in the buffered mmap bitslice. I expect the effect to be similar for the other types.

Here's a bench for 100 iterations on look ups with 1 million pending changes:

# Before PR
$ cargo bench --bench mmap_bitslice_buffered_update_wrapper
mmap-bitslice-buffered-update-wrapper/lookup-with-pending-changes
                        time:   [43.298 ms 43.611 ms 43.949 ms]

# After PR
$ cargo bench --bench mmap_bitslice_buffered_update_wrapper
mmap-bitslice-buffered-update-wrapper/lookup-with-pending-changes
                        time:   [30.691 ms 30.881 ms 31.113 ms]
                        change: [-29.886% -29.189% -28.441%] (p = 0.00 < 0.05)
                        Performance has improved.

Graph for 100 criterion samples

Red shows the lookup timing distribution for the type before this PR. Blue shows the timings after this PR, being about 30% faster.

Note that in the case of the pending changes list, it may be empty a lot of the time. In that case there is no time difference. The improvement in lookup time will get more and more significant as the pending changes list grows.

All Submissions:

  • Contributions should target the dev branch. Did you create your branch from dev?
  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

  1. Does your submission pass tests?
  2. Have you formatted your code locally using cargo +nightly fmt --all command prior to submission?
  3. Have you checked your code using cargo clippy --all --all-features command?

Changes to Core Features:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your core changes, as applicable?
  • Have you successfully ran tests with your changes locally?

@timvisee timvisee changed the title Speed up pending changes lookups Speed up pending changes lookups using ahash Apr 15, 2025
@timvisee timvisee force-pushed the ahash-in-buffer-wrappers-and-payload branch from f85b230 to 054f906 Compare April 15, 2025 10:10
@timvisee timvisee requested review from agourlay, generall, ffuugoo, JojiiOfficial, coszio and Copilot and removed request for agourlay April 15, 2025 10:42
@timvisee timvisee marked this pull request as ready for review April 15, 2025 10:42
Copilot

This comment was marked as resolved.

This comment was marked as resolved.

coderabbitai[bot]

This comment was marked as resolved.

@generall generall merged commit 10ef9ee into dev Apr 15, 2025
17 checks passed
@generall generall deleted the ahash-in-buffer-wrappers-and-payload branch April 15, 2025 12:06
pull bot pushed a commit to kp-forks/qdrant that referenced this pull request Apr 21, 2025
* Add bench for mmap_bitslice_buffered_update_wrapper

* Use ahash in buffered mmap/RocksDB types and in payload storage

* Function is used in testing only
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants