MADV_POPULATE_READ on sequential mmaps #6923
Awesome work 👏 I spent some time investigating some page fault around
It turns out `MADV_POPULATE_READ` performance is influenced by the `MADV_RANDOM` and `MADV_SEQUENTIAL` flags.

Obvious in hindsight, but still surprising that the kernel won't enable readahead unconditionally given that it knows it will be reading the whole file.
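To make the interaction concrete, here is a minimal Rust sketch (not the code from this PR) that maps a file read-only, applies either `MADV_RANDOM` or `MADV_SEQUENTIAL`, and then prefaults the mapping with `MADV_POPULATE_READ`. It assumes 64-bit Linux. The constants are hard-coded from `<sys/mman.h>` so the sketch needs no extra crates, and `AccessPattern` and `populate_with_advice` are hypothetical names chosen for illustration. `MADV_POPULATE_READ` requires Linux 5.14 or newer, so older kernels are expected to return `EINVAL`.

```rust
use std::fs::File;
use std::io;
use std::os::raw::{c_int, c_void};
use std::os::unix::io::AsRawFd;

// Linux values from <sys/mman.h>, hard-coded for this sketch (assumption: 64-bit Linux).
const PROT_READ: c_int = 1;
const MAP_SHARED: c_int = 1;
const MADV_RANDOM: c_int = 1;
const MADV_SEQUENTIAL: c_int = 2;
const MADV_POPULATE_READ: c_int = 22; // available since Linux 5.14

extern "C" {
    fn mmap(addr: *mut c_void, len: usize, prot: c_int, flags: c_int, fd: c_int, offset: i64) -> *mut c_void;
    fn munmap(addr: *mut c_void, len: usize) -> c_int;
    fn madvise(addr: *mut c_void, len: usize, advice: c_int) -> c_int;
}

/// Hypothetical helper type: the access pattern hint applied before populating.
#[derive(Clone, Copy)]
enum AccessPattern {
    Random,
    Sequential,
}

/// Maps the first `len` bytes of `file` read-only, applies the access pattern
/// hint, then asks the kernel to prefault the whole range.
fn populate_with_advice(file: &File, len: usize, pattern: AccessPattern) -> io::Result<()> {
    let advice = match pattern {
        AccessPattern::Random => MADV_RANDOM,
        AccessPattern::Sequential => MADV_SEQUENTIAL,
    };
    unsafe {
        let addr = mmap(std::ptr::null_mut(), len, PROT_READ, MAP_SHARED, file.as_raw_fd(), 0);
        if addr as isize == -1 {
            // MAP_FAILED
            return Err(io::Error::last_os_error());
        }
        let mut result = Ok(());
        // The hint is set first; the populate call then runs under that hint.
        if madvise(addr, len, advice) != 0 || madvise(addr, len, MADV_POPULATE_READ) != 0 {
            result = Err(io::Error::last_os_error());
        }
        munmap(addr, len);
        result
    }
}
```

Timing the two variants only gives meaningful numbers when the file is not already in the page cache, so a fair comparison needs a cold cache for each run.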
Profiler
I found it using this profiler: https://github.com/xzfc/qdrant-tools/tree/3934acc179c9187a6b90d470de9a48fd0a9dd3c9/bpf-mfault-paths
Fix
Since many of our files are mmapped twice (once for sequential and once for random access), the fix in this PR is trivial: let `populate()` use the sequential mmap if it is available.

Fixed:
Some mmaps are not duplicated, so these are not fixed:
- `MmapBitSlice` (deleted flags).
- `bitmask.dat` and `gaps.dat` (used in sparse vectors)

CPU usage/saturation on 400M benchmark (left is before this commit, right is after):

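The selection rule from the Fix section can be sketched as follows. This is an illustration under assumptions, not the PR's actual code: `Advice`, `Mmap`, and `Storage` are hypothetical stand-ins, and `Mmap::populate` is a placeholder for the point where real code would issue `MADV_POPULATE_READ`.

```rust
/// Access pattern a mapping was opened with (hypothetical stand-in for the madvise hint).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Advice {
    Random,
    Sequential,
}

/// Hypothetical stand-in for a memory-mapped file.
struct Mmap {
    advice: Advice,
}

impl Mmap {
    /// Placeholder: real code would issue MADV_POPULATE_READ on the mapping here.
    /// Returns the advice of the mapping that was populated.
    fn populate(&self) -> Advice {
        self.advice
    }
}

/// A storage that always has a random-access mapping and may also hold a
/// second, sequential mapping of the same file.
struct Storage {
    mmap: Mmap,
    mmap_seq: Option<Mmap>,
}

impl Storage {
    /// Populate through the sequential mapping when there is one,
    /// otherwise fall back to the random-access mapping.
    fn populate(&self) -> Advice {
        self.mmap_seq.as_ref().unwrap_or(&self.mmap).populate()
    }
}
```

Storages that keep only a single random-access mapping take the fallback branch, which matches the "not fixed" cases listed above.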