-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Do not rate limit scroll operations for shard transfers #6118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
09b61c1 to
eee7c54
Compare
| r = requests.patch( | ||
| f"{source_uri}/collections/test_collection", json={ | ||
| "strict_mode_config": { | ||
| "enabled": True, | ||
| "write_rate_limit": 1, | ||
| # TODO(ratelimit) validate read limits as well "read_rate_limit": 1 | ||
| "read_rate_limit": 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
without the fix, enabling this line makes the test fail due to read rate limit reached :)
📝 WalkthroughWalkthroughThis pull request makes several modifications across multiple files. In the collection and shard modules, TODO comment lines related to rate limiting are removed from the Suggested reviewers
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
tests/consensus_tests/test_shard_transfer_rate_limiting.py (1)
169-170: Clarity improvement for commentThe comment describes waiting for "propagation of disable command", but the function being called is
wait_for_strict_mode_enabled. Consider updating the comment for better clarity.- # Wait for propagation of disable command + # Wait for propagation of enable command🧰 Tools
🪛 Ruff (0.8.2)
170-170:
wait_for_strict_mode_enabledmay be undefined, or defined from star imports(F405)
tests/consensus_tests/utils.py (1)
503-507: Good implementation of utility function to check disabled strict modeThis function complements the existing
check_strict_mode_enabledfunction, providing a symmetric approach for checking strict mode status.There's a small improvement to apply to follow Python best practices:
def check_strict_mode_disabled(peer_api_uri: str, collection_name: str) -> bool: collection_info = get_collection_info(peer_api_uri, collection_name) strict_mode_enabled = collection_info["config"]["strict_mode_config"]["enabled"] - return strict_mode_enabled == False + return not strict_mode_enabled🧰 Tools
🪛 Ruff (0.8.2)
506-506: Avoid equality comparisons to
False; useif not strict_mode_enabled:for false checksReplace with
not strict_mode_enabled(E712)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
lib/collection/src/collection/clean.rs(0 hunks)lib/collection/src/shards/forward_proxy_shard.rs(0 hunks)lib/collection/src/shards/local_shard/mod.rs(1 hunks)lib/collection/src/shards/local_shard/shard_ops.rs(11 hunks)lib/common/common/src/counter/hardware_accumulator.rs(5 hunks)tests/consensus_tests/test_shard_transfer_rate_limiting.py(6 hunks)tests/consensus_tests/utils.py(2 hunks)
💤 Files with no reviewable changes (2)
- lib/collection/src/shards/forward_proxy_shard.rs
- lib/collection/src/collection/clean.rs
🧰 Additional context used
🪛 Ruff (0.8.2)
tests/consensus_tests/utils.py
506-506: Avoid equality comparisons to False; use if not strict_mode_enabled: for false checks
Replace with not strict_mode_enabled
(E712)
tests/consensus_tests/test_shard_transfer_rate_limiting.py
86-86: requests may be undefined, or defined from star imports
(F405)
134-134: requests may be undefined, or defined from star imports
(F405)
143-143: wait_for_strict_mode_disabled may be undefined, or defined from star imports
(F405)
159-159: requests may be undefined, or defined from star imports
(F405)
170-170: wait_for_strict_mode_enabled may be undefined, or defined from star imports
(F405)
191-191: requests may be undefined, or defined from star imports
(F405)
200-200: wait_for_strict_mode_disabled may be undefined, or defined from star imports
(F405)
216-216: requests may be undefined, or defined from star imports
(F405)
227-227: wait_for_strict_mode_enabled may be undefined, or defined from star imports
(F405)
248-248: requests may be undefined, or defined from star imports
(F405)
257-257: wait_for_strict_mode_disabled may be undefined, or defined from star imports
(F405)
🔇 Additional comments (20)
tests/consensus_tests/test_shard_transfer_rate_limiting.py (6)
91-91: Enables read rate limiting for the testThe previously commented-out line for setting
read_rate_limitis now enabled, which aligns with the PR's objective of testing shard transfers with read rate limiting. Previously, this needed to be disabled because shard transfers would fail when hitting read rate limits.
133-144: Well-structured test approach for disabling rate limits to verify data consistencyThis new section disables read limits after a shard transfer to ensure data consistency can be verified without being affected by rate limiting. This pattern allows the test to validate both that:
- Shard transfers complete successfully with read rate limiting enabled (due to the PR changes)
- The transferred data is consistent across peers
A good testing practice to isolate concerns.
🧰 Tools
🪛 Ruff (0.8.2)
134-134:
requestsmay be undefined, or defined from star imports(F405)
143-143:
wait_for_strict_mode_disabledmay be undefined, or defined from star imports(F405)
158-171: Test setup re-enables rate limiting for further validation stepsRe-enabling rate limiting after the consistency check is important to ensure subsequent shard transfer tests are performed under the same rate-limited conditions, properly validating the PR's objective.
🧰 Tools
🪛 Ruff (0.8.2)
159-159:
requestsmay be undefined, or defined from star imports(F405)
170-170:
wait_for_strict_mode_enabledmay be undefined, or defined from star imports(F405)
190-201: Consistent pattern repeated for each test segmentThe consistent approach of disabling rate limits to check data consistency after shard operations ensures thorough validation of the core functionality. This pattern is correctly repeated across the test to verify data consistency after each shard operation.
Also applies to: 247-258
🧰 Tools
🪛 Ruff (0.8.2)
191-191:
requestsmay be undefined, or defined from star imports(F405)
200-200:
wait_for_strict_mode_disabledmay be undefined, or defined from star imports(F405)
215-228: Proper test scaffolding ensures comprehensive validationThe re-enablement of rate limiting after each consistency check ensures that subsequent shard operations are properly tested under rate limiting conditions. This comprehensive approach thoroughly validates the PR's objective of allowing shard transfers to work properly even with rate limiting enabled.
🧰 Tools
🪛 Ruff (0.8.2)
216-216:
requestsmay be undefined, or defined from star imports(F405)
227-227:
wait_for_strict_mode_enabledmay be undefined, or defined from star imports(F405)
143-143:✅ Verification successful
Verify the imported utility function
This line uses a new utility function
wait_for_strict_mode_disabledimported via star imports.
🏁 Script executed:
#!/bin/bash # Verify the utility function implementation grep -A 10 "def wait_for_strict_mode_disabled" tests/consensus_tests/utils.pyLength of output: 553
Utility Function Verification Complete
The utility functionwait_for_strict_mode_disabledintests/consensus_tests/utils.pyhas been confirmed to be implemented correctly. It callswait_forwith thecheck_strict_mode_disabledcondition and properly handles exceptions by logging cluster info. The call intests/consensus_tests/test_shard_transfer_rate_limiting.pyline 143 uses this function appropriately via star-imports.🧰 Tools
🪛 Ruff (0.8.2)
143-143:
wait_for_strict_mode_disabledmay be undefined, or defined from star imports(F405)
tests/consensus_tests/utils.py (1)
633-639: Properly implemented wait function for strict mode disabled statusThis utility function complements the existing
wait_for_strict_mode_enabledfunction, providing a symmetric mechanism to wait until strict mode is disabled. It correctly uses thewait_forutility function and handles exceptions appropriately.lib/common/common/src/counter/hardware_accumulator.rs (3)
103-104: Well-documented new field for disposable operationsThe addition of the
disposableboolean field with clear documentation explains its purpose: when set to true, the accumulator will not accumulate any values. This is central to the PR's goal of allowing certain operations to bypass rate limiting.
113-114: Consistent initialization of the disposable field across all constructorsThe
disposablefield is properly initialized in all relevant constructors and maintained in thecloneimplementation, ensuring consistent behavior. Thedisposable()constructor explicitly sets it totruewhile other constructors set it tofalseby default.Also applies to: 124-125, 141-142, 192-193
128-130: Simple accessor method for the disposable flagThe
is_disposablemethod provides a clean accessor for thedisposableflag, following Rust's naming conventions for boolean getters. This will be used by the shard operations to determine whether to bypass rate limiting.lib/collection/src/shards/local_shard/mod.rs (4)
1202-1207: Enhanced method signature for better context and controlThe
check_read_rate_limitermethod now accepts additional parameters:
hw_measurement_acc: To check if the operation is tagged as disposablecontext: To provide better error messages when rate limiting is enforcedThis change enables the core functionality of the PR while also improving the debugging experience.
1208-1211: Key PR implementation: bypass rate limiting for disposable operationsThis is the core of the PR's implementation - operations tagged with disposable hardware measurements now bypass the read rate limiter. This change allows internal operations like scroll operations during shard transfers to proceed without being rate-limited, addressing the issue described in the PR's objectives.
1216-1219: Improved error logging for rate limiter errorsThe enhanced error handling logs the context and error details when rate limiting is enforced, which will aid in debugging rate limiter issues. This is a good improvement beyond the core PR objective.
1202-1222:✅ Verification successful
Verify callers are updated with the new parameters
The method signature has changed, so all callers must be updated to pass the two new parameters.
🏁 Script executed:
#!/bin/bash # Find all callers of check_read_rate_limiter and verify they're updated grep -r "check_read_rate_limiter" lib/collection/src/shards --include="*.rs" | grep -v "fn check_read_rate_limiter"Length of output: 896
Verified: Callers are Updated with New Parameters
The grep output confirms that all callers inlib/collection/src/shards/local_shard/shard_ops.rsare passing both thehw_measurement_accand the newcontextparameters as required by the updated method signature. No further changes are needed in this area.lib/collection/src/shards/local_shard/shard_ops.rs (6)
113-127: Improvement: Enhanced rate limiting with operation contextThe addition of documentation and passing hardware measurement accumulator to the rate limiter check enhances the system's ability to selectively bypass rate limiting for operations associated with shard transfers, which aligns with the PR objectives.
176-185: Consistent implementation of rate limiting controlThe pattern of passing the hardware measurement accumulator and operation context is consistently applied here, with appropriate cost calculation based on the number of searches in the request.
190-199: Well-implemented rate limiting controlThe count method now correctly passes the hardware measurement context, enabling the skip of rate limiting for operations with disposable measurements.
222-233: Proper implementation of rate limiting bypass mechanismThe retrieve method follows the same pattern of passing hardware measurements to the rate limiter check, maintaining consistency across all rate-limited operations.
258-269: Smart rate limiting cost calculationGood implementation of cost calculation that accounts for both searches and scrolls in the planned query, providing an accurate representation of the computational cost for rate limiting purposes.
279-288: Consistent application of rate limiting bypassThe facet method implementation completes the consistent pattern applied across all rate-limited operations, ensuring uniform behavior with the new rate limiting bypass feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of the disposable flag, to just reuse the HwMeasurement type.
I explicitly checks this is sound for all shard transfer types. 👍
Do not apply read rate limit to scroll operation issued for shard transfers.
The fix relies on leveraging the work done by tagging internal operations with disposable hardware measurements.
Now all operations with disposable hardware measurements by-pass the read rate limiter.