Scroll lock: make sure no segments are modified between scroll and retrieve #6289

Conversation
📝 Walkthrough

The changes introduce a new synchronization mechanism across several components handling collection and segment operations. A write lock is now acquired in the update process, using a blocking mechanism to lock a shared resource before proceeding with update actions. Additionally, a new public field for the scroll lock has been added to the segment holder.
Actionable comments posted: 0
🧹 Nitpick comments (3)
lib/collection/src/collection_manager/holders/segment_holder.rs (1)

159-166: Good addition for scroll operation consistency.

The introduction of a dedicated lock for scroll operations is a sound approach to ensure data consistency. This helps prevent the problem where points might change between the scroll and retrieve phases.

Some implementation considerations:

- The lock is an async RwLock, which allows efficient concurrent scrolling operations while blocking updates.
- The documentation clearly explains the purpose and usage pattern.

The empty tuple `()` as a lock resource is a common pattern for mutually exclusive access without the need for data. This approach ensures proper synchronization with minimal overhead.

lib/collection/src/collection_manager/collection_updater.rs (1)
47-48: Acquire write lock on scroll operations for thread safety.

This implementation ensures that no scroll operations can run while an update is in progress, preventing potential race conditions. The lock is held for the entire duration of the update operation.

Consider renaming `_scroll_lock` to `_scroll_lock_guard` to make it clear this is a guard pattern and the lock remains held until this variable is dropped. Additionally, since this is a blocking operation in an otherwise non-blocking context, it might be worth adding a comment explaining why this synchronous pattern is necessary.

```diff
-        let _scroll_lock = futures::executor::block_on(scroll_lock.write());
+        // Block current thread to acquire write lock, ensuring no ongoing scroll operations
+        // can be affected by this update
+        let _scroll_lock_guard = futures::executor::block_on(scroll_lock.write());
```

lib/collection/src/shards/local_shard/scroll.rs (1)
463-463: Consistent lock release pattern applied to all scroll methods.

The same explicit lock dropping pattern is applied here, which is excellent for consistency.

Consider adding a brief comment in each scroll method explaining the importance of holding the lock throughout the entire scroll+retrieve operation. This would help future maintainers understand why this locking pattern is crucial.
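The guard pattern discussed above can be sketched as follows. This is a simplified illustration using `std::sync::RwLock` (the actual code uses tokio's async `RwLock`); the names and point counts are hypothetical. A scroll holds the read guard across both phases, and an update must wait for the write side:

```rust
use std::sync::{Arc, RwLock};
use std::thread;

// Scroll holds the read side across both phases; update takes the write side.
fn scroll_with_guard(scroll_lock: Arc<RwLock<()>>) -> usize {
    let for_scroll = Arc::clone(&scroll_lock);
    let scroll = thread::spawn(move || {
        let _guard = for_scroll.read().unwrap();
        // Phase 1: select point ids. Phase 2: retrieve payloads.
        // No update can take the write lock while this guard is held.
        5 // pretend we retrieved 5 points
    });
    let points = scroll.join().unwrap();

    // Update path: the write guard waits for all in-flight scrolls.
    let _write_guard = scroll_lock.write().unwrap();
    points
}

fn main() {
    // Empty tuple as the lock resource: the lock carries no data,
    // it only orders scrolls (readers) against updates (writers).
    let lock = Arc::new(RwLock::new(()));
    let points = scroll_with_guard(lock);
    assert_eq!(points, 5);
    println!("scrolled {points} points, then update acquired write lock");
}
```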
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)

- lib/collection/src/collection_manager/collection_updater.rs (1 hunks)
- lib/collection/src/collection_manager/holders/segment_holder.rs (1 hunks)
- lib/collection/src/shards/local_shard/scroll.rs (6 hunks)
🧰 Additional context used
🧬 Code Definitions (1)

lib/collection/src/collection_manager/collection_updater.rs (2)

- lib/collection/src/collection_manager/holders/segment_holder.rs (2): segments (1447-1450), segments (1996-1999)
- lib/collection/src/shards/local_shard/mod.rs (1): segments (241-243)
⏰ Context from checks skipped due to timeout of 90000ms (13)
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: Basic TLS/HTTPS tests
- GitHub Check: test-low-resources
- GitHub Check: test-consistency
- GitHub Check: test (macos-latest)
- GitHub Check: test-consensus-compose
- GitHub Check: test-consensus
- GitHub Check: test (windows-latest)
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test
- GitHub Check: test (ubuntu-latest)
🔇 Additional comments (3)
lib/collection/src/shards/local_shard/scroll.rs (3)

152-159: Consistent scroll lock implementation for thread safety.

This implementation safely acquires the lock and segment state in a way that prevents race conditions: first retrieving all necessary data within a read lock scope, then acquiring a longer-term read lock on the `scroll_read_lock` specifically.

The same pattern is consistently applied to all scroll methods in this file, which is excellent for maintainability.

215-215: Explicit lock release improves resource management.

Explicitly dropping the scroll lock as soon as retrieve is complete ensures that update operations aren't blocked any longer than necessary.

This pattern is consistently applied across all scroll methods, ensuring resource locks are held for the minimum required time.

317-317: Consistent explicit lock release pattern.

Same pattern as in scroll_by_id: explicitly releasing the lock when no longer needed.
Could you elaborate on why this doesn't help with deleted points? Can't we detect deleted points by comparing point counts between the two calls? The only way we could resolve the situation in that case is to re-fetch. I don't suggest that's a good idea, but I'm just curious why it didn't work out.
```diff
@@ -44,6 +44,9 @@ impl CollectionUpdater {
     ) -> CollectionResult<usize> {
         // Allow only one update at a time, ensure no data races between segments.
         // let _lock = self.update_lock.lock().unwrap();
+        let scroll_lock = segments.read().scroll_read_lock.clone();
+        let _scroll_lock = futures::executor::block_on(scroll_lock.write());
```
This can deadlock, because it might block the current async executor thread.
I think the versions solve the problem of returning a different payload than what the initial selection stated. However, it doesn't solve the problem of returning fewer points than initially selected. In regular queries it shouldn't be problematic, because we don't rely on the last point to provide a next-page offset.
@coszio I don't get why this is a problem. We might only select fewer points in the second call. We can detect it and act accordingly. So what's the problem? I imagine this:
Since we rely on scrolling for transfers, we need them to be accurate, so I cannot oppose this change. However, we should be aware that this can hurt performance under specific workloads.

In manual testing, using perf builds, I created a collection with `bfb --segments 100 -n 100000 --on-disk-payload --text-payloads --text-payload-length 500`.

Then I ran scrolls and upserts simultaneously:

- Upsert: `bfb --skip-create -n 100000 --text-payloads --text-payload-length 500 --batch-size 1 --throttle 1000`; this didn't trigger optimizations, so it reflects only the upsert effects.
- Scroll: `bfb --skip-create --skip-upload --skip-wait-index --scroll`
I observed some degradation in RPS:

| Branch | Upsert RPS (throttled to 1000) | Scroll RPS |
|---|---|---|
| dev | 999.97 | 1,996.85 |
| scroll-lock | 866.80 | 1,722.63 |

As a baseline, scrolls without simultaneous upserts reached up to around 2,300 RPS.
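For reference, the relative slowdown implied by the numbers above works out to roughly 13% on both paths (my own arithmetic, not stated in the PR):

```rust
// Percentage drop from `before` to `after`.
fn degradation_pct(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

fn main() {
    let upsert = degradation_pct(999.97, 866.80);
    let scroll = degradation_pct(1996.85, 1722.63);
    // Both paths slow down by roughly the same fraction.
    println!("upsert: -{upsert:.1}%, scroll: -{scroll:.1}%"); // ≈ 13.3% and 13.7%
}
```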
Thank you very much for the careful review. It'd be nice not to have this impact, but consistency is more important here, so we don't really have a choice. Maybe some of our future efforts to clean up the locks around this area will have a positive effect 🤞
Scroll lock: make sure no segments are modified between scroll and retrieve (qdrant#6289)

* Scroll lock: make sure no segments are modified between scroll and retrieve
* Take write lock in block_in_place to prevent blocking tokio runtime
* Use blocking write
* Recover collection snapshot on general runtime
* Make function const

Co-authored-by: timvisee <tim@visee.me>
This PR fixes a rare but possible loss of points in scroll results:

Due to the 2-stage search+retrieve approach in scroll, we can't guarantee that points retrieved in the first stage will be available and/or unchanged in the second stage. This might create a problem of either returning a changed payload, which no longer satisfies the filters, or compromising the number of returned points.

This is especially critical for scroll requests, which rely on the number of returned points to generate a "next_page" offset, and which are used in internal operations like shard streaming.
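The race described above can be sketched with a toy in-memory "segment" (a hypothetical stand-in, not Qdrant's actual API): without a lock spanning both stages, a delete between selection and retrieval silently shrinks the result.

```rust
use std::collections::BTreeMap;
use std::sync::RwLock;

// Returns (ids selected in stage 1, points retrieved in stage 2).
fn simulate_race() -> (usize, usize) {
    // A toy segment: point id -> payload.
    let segment = RwLock::new(BTreeMap::from([(1u32, "a"), (2, "b"), (3, "c")]));

    // Stage 1: scroll selects matching point ids.
    let ids: Vec<u32> = segment.read().unwrap().keys().copied().collect();

    // A concurrent update sneaks in between the two stages.
    segment.write().unwrap().remove(&2);

    // Stage 2: retrieve payloads for the selected ids.
    let guard = segment.read().unwrap();
    let retrieved: Vec<&str> = ids.iter().filter_map(|id| guard.get(id).copied()).collect();

    (ids.len(), retrieved.len())
}

fn main() {
    let (selected, retrieved) = simulate_race();
    assert_eq!((selected, retrieved), (3, 2)); // one point silently lost
    println!("selected {selected} ids, retrieved {retrieved} points");
}
```

Holding a scroll lock across both stages, as this PR does, prevents the write in the middle from running until retrieval completes.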
What I tried that didn't work:
Current approach drawbacks: