Do not rate limit scroll operations for shard transfers #6118

agourlay · 2025-03-07T08:29:13Z

Do not apply read rate limit to scroll operation issued for shard transfers.

The fix relies on leveraging the work done by tagging internal operations with disposable hardware measurements.

Now all operations with disposable hardware measurements by-pass the read rate limiter.

agourlay · 2025-03-07T08:35:10Z

tests/consensus_tests/test_shard_transfer_rate_limiting.py

    r = requests.patch(
        f"{source_uri}/collections/test_collection", json={
            "strict_mode_config": {
                "enabled": True,
                "write_rate_limit": 1,
-                # TODO(ratelimit) validate read limits as well "read_rate_limit": 1
+                "read_rate_limit": 1,


without the fix, enabling this line makes the test fail due to read rate limit reached :)

coderabbitai · 2025-03-07T10:32:26Z

📝 Walkthrough

Walkthrough

This pull request makes several modifications across multiple files. In the collection and shard modules, TODO comment lines related to rate limiting are removed from the clean_task and transfer_batch methods. The LocalShard implementation has been updated to enhance its check_read_rate_limiter method; it now accepts additional parameters—a hardware measurement accumulator and a context string—to better handle disposable operations and provide improved logging for rate limiter errors. The corresponding asynchronous shard operation methods have been updated to pass these new parameters. Additionally, the hardware accumulator struct now includes a new boolean field disposable with a corresponding accessor method and adjustments in its cloning behavior. Test files have been revised to toggle read rate limits for shard transfers, and new utility functions have been added to check and wait for disabled strict mode configurations.

Suggested reviewers

timvisee

✨ Finishing Touches

📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

tests/consensus_tests/test_shard_transfer_rate_limiting.py (1)
169-170: Clarity improvement for comment

The comment describes waiting for "propagation of disable command", but the function being called is wait_for_strict_mode_enabled. Consider updating the comment for better clarity.
-    # Wait for propagation of disable command
+    # Wait for propagation of enable command
🧰 Tools

🪛 Ruff (0.8.2)

170-170: wait_for_strict_mode_enabled may be undefined, or defined from star imports

(F405)
tests/consensus_tests/utils.py (1)
503-507: Good implementation of utility function to check disabled strict mode

This function complements the existing check_strict_mode_enabled function, providing a symmetric approach for checking strict mode status.

There's a small improvement to apply to follow Python best practices:
def check_strict_mode_disabled(peer_api_uri: str, collection_name: str) -> bool:
    collection_info = get_collection_info(peer_api_uri, collection_name)
    strict_mode_enabled = collection_info["config"]["strict_mode_config"]["enabled"]
-    return strict_mode_enabled == False
+    return not strict_mode_enabled
🧰 Tools

🪛 Ruff (0.8.2)

506-506: Avoid equality comparisons to False; use if not strict_mode_enabled: for false checks

Replace with not strict_mode_enabled

(E712)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 293e2e9 and eee7c54.

📒 Files selected for processing (7)

lib/collection/src/collection/clean.rs (0 hunks)
lib/collection/src/shards/forward_proxy_shard.rs (0 hunks)
lib/collection/src/shards/local_shard/mod.rs (1 hunks)
lib/collection/src/shards/local_shard/shard_ops.rs (11 hunks)
lib/common/common/src/counter/hardware_accumulator.rs (5 hunks)
tests/consensus_tests/test_shard_transfer_rate_limiting.py (6 hunks)
tests/consensus_tests/utils.py (2 hunks)

💤 Files with no reviewable changes (2)

lib/collection/src/shards/forward_proxy_shard.rs
lib/collection/src/collection/clean.rs

🧰 Additional context used

🪛 Ruff (0.8.2)

tests/consensus_tests/utils.py

506-506: Avoid equality comparisons to False; use if not strict_mode_enabled: for false checks

Replace with not strict_mode_enabled

(E712)

tests/consensus_tests/test_shard_transfer_rate_limiting.py

86-86: requests may be undefined, or defined from star imports

(F405)

134-134: requests may be undefined, or defined from star imports

(F405)

143-143: wait_for_strict_mode_disabled may be undefined, or defined from star imports

(F405)

159-159: requests may be undefined, or defined from star imports

(F405)

170-170: wait_for_strict_mode_enabled may be undefined, or defined from star imports

(F405)

191-191: requests may be undefined, or defined from star imports

(F405)

200-200: wait_for_strict_mode_disabled may be undefined, or defined from star imports

(F405)

216-216: requests may be undefined, or defined from star imports

(F405)

227-227: wait_for_strict_mode_enabled may be undefined, or defined from star imports

(F405)

248-248: requests may be undefined, or defined from star imports

(F405)

257-257: wait_for_strict_mode_disabled may be undefined, or defined from star imports

(F405)

🔇 Additional comments (20)

tests/consensus_tests/test_shard_transfer_rate_limiting.py (6)

91-91: Enables read rate limiting for the test

The previously commented-out line for setting read_rate_limit is now enabled, which aligns with the PR's objective of testing shard transfers with read rate limiting. Previously, this needed to be disabled because shard transfers would fail when hitting read rate limits.

133-144: Well-structured test approach for disabling rate limits to verify data consistency

This new section disables read limits after a shard transfer to ensure data consistency can be verified without being affected by rate limiting. This pattern allows the test to validate both that:

Shard transfers complete successfully with read rate limiting enabled (due to the PR changes)

The transferred data is consistent across peers

A good testing practice to isolate concerns.

🧰 Tools

🪛 Ruff (0.8.2)

134-134: requests may be undefined, or defined from star imports

(F405)

143-143: wait_for_strict_mode_disabled may be undefined, or defined from star imports

(F405)

158-171: Test setup re-enables rate limiting for further validation steps

Re-enabling rate limiting after the consistency check is important to ensure subsequent shard transfer tests are performed under the same rate-limited conditions, properly validating the PR's objective.

🧰 Tools

🪛 Ruff (0.8.2)

159-159: requests may be undefined, or defined from star imports

(F405)

170-170: wait_for_strict_mode_enabled may be undefined, or defined from star imports

(F405)

190-201: Consistent pattern repeated for each test segment

The consistent approach of disabling rate limits to check data consistency after shard operations ensures thorough validation of the core functionality. This pattern is correctly repeated across the test to verify data consistency after each shard operation.

Also applies to: 247-258

🧰 Tools

🪛 Ruff (0.8.2)

191-191: requests may be undefined, or defined from star imports

(F405)

200-200: wait_for_strict_mode_disabled may be undefined, or defined from star imports

(F405)

215-228: Proper test scaffolding ensures comprehensive validation

The re-enablement of rate limiting after each consistency check ensures that subsequent shard operations are properly tested under rate limiting conditions. This comprehensive approach thoroughly validates the PR's objective of allowing shard transfers to work properly even with rate limiting enabled.

🧰 Tools

🪛 Ruff (0.8.2)

216-216: requests may be undefined, or defined from star imports

(F405)

227-227: wait_for_strict_mode_enabled may be undefined, or defined from star imports

(F405)

143-143:
✅ Verification successful

Verify the imported utility function

This line uses a new utility function wait_for_strict_mode_disabled imported via star imports.

🏁 Script executed:
#!/bin/bash
# Verify the utility function implementation
grep -A 10 "def wait_for_strict_mode_disabled" tests/consensus_tests/utils.py
Length of output: 553
Utility Function Verification Complete
The utility function wait_for_strict_mode_disabled in tests/consensus_tests/utils.py has been confirmed to be implemented correctly. It calls wait_for with the check_strict_mode_disabled condition and properly handles exceptions by logging cluster info. The call in tests/consensus_tests/test_shard_transfer_rate_limiting.py line 143 uses this function appropriately via star-imports.

🧰 Tools

🪛 Ruff (0.8.2)

143-143: wait_for_strict_mode_disabled may be undefined, or defined from star imports

(F405)
tests/consensus_tests/utils.py (1)

633-639: Properly implemented wait function for strict mode disabled status

This utility function complements the existing wait_for_strict_mode_enabled function, providing a symmetric mechanism to wait until strict mode is disabled. It correctly uses the wait_for utility function and handles exceptions appropriately.

lib/common/common/src/counter/hardware_accumulator.rs (3)

103-104: Well-documented new field for disposable operations

The addition of the disposable boolean field with clear documentation explains its purpose: when set to true, the accumulator will not accumulate any values. This is central to the PR's goal of allowing certain operations to bypass rate limiting.

113-114: Consistent initialization of the disposable field across all constructors

The disposable field is properly initialized in all relevant constructors and maintained in the clone implementation, ensuring consistent behavior. The disposable() constructor explicitly sets it to true while other constructors set it to false by default.

Also applies to: 124-125, 141-142, 192-193

128-130: Simple accessor method for the disposable flag

The is_disposable method provides a clean accessor for the disposable flag, following Rust's naming conventions for boolean getters. This will be used by the shard operations to determine whether to bypass rate limiting.
lib/collection/src/shards/local_shard/mod.rs (4)

1202-1207: Enhanced method signature for better context and control

The check_read_rate_limiter method now accepts additional parameters:

hw_measurement_acc: To check if the operation is tagged as disposable

context: To provide better error messages when rate limiting is enforced

This change enables the core functionality of the PR while also improving the debugging experience.

1208-1211: Key PR implementation: bypass rate limiting for disposable operations

This is the core of the PR's implementation - operations tagged with disposable hardware measurements now bypass the read rate limiter. This change allows internal operations like scroll operations during shard transfers to proceed without being rate-limited, addressing the issue described in the PR's objectives.

1216-1219: Improved error logging for rate limiter errors

The enhanced error handling logs the context and error details when rate limiting is enforced, which will aid in debugging rate limiter issues. This is a good improvement beyond the core PR objective.

1202-1222:
✅ Verification successful

Verify callers are updated with the new parameters

The method signature has changed, so all callers must be updated to pass the two new parameters.

🏁 Script executed:
#!/bin/bash
# Find all callers of check_read_rate_limiter and verify they're updated
grep -r "check_read_rate_limiter" lib/collection/src/shards --include="*.rs" | grep -v "fn check_read_rate_limiter"
Length of output: 896
Verified: Callers are Updated with New Parameters
The grep output confirms that all callers in lib/collection/src/shards/local_shard/shard_ops.rs are passing both the hw_measurement_acc and the new context parameters as required by the updated method signature. No further changes are needed in this area.
lib/collection/src/shards/local_shard/shard_ops.rs (6)

113-127: Improvement: Enhanced rate limiting with operation context

The addition of documentation and passing hardware measurement accumulator to the rate limiter check enhances the system's ability to selectively bypass rate limiting for operations associated with shard transfers, which aligns with the PR objectives.

176-185: Consistent implementation of rate limiting control

The pattern of passing the hardware measurement accumulator and operation context is consistently applied here, with appropriate cost calculation based on the number of searches in the request.

190-199: Well-implemented rate limiting control

The count method now correctly passes the hardware measurement context, enabling the skip of rate limiting for operations with disposable measurements.

222-233: Proper implementation of rate limiting bypass mechanism

The retrieve method follows the same pattern of passing hardware measurements to the rate limiter check, maintaining consistency across all rate-limited operations.

258-269: Smart rate limiting cost calculation

Good implementation of cost calculation that accounts for both searches and scrolls in the planned query, providing an accurate representation of the computational cost for rate limiting purposes.

279-288: Consistent application of rate limiting bypass

The facet method implementation completes the consistent pattern applied across all rate-limited operations, ensuring uniform behavior with the new rate limiting bypass feature.

timvisee

I like the idea of the disposable flag, to just reuse the HwMeasurement type.

I explicitly checks this is sound for all shard transfer types. 👍

agourlay added this to the Rate limiting milestone Mar 7, 2025

Do not rate limit scroll operations for shard transfers

eee7c54

agourlay force-pushed the no-rate-limite-scroll-for-transfers branch from 09b61c1 to eee7c54 Compare March 7, 2025 08:31

agourlay commented Mar 7, 2025

View reviewed changes

agourlay marked this pull request as ready for review March 7, 2025 10:29

agourlay requested a review from timvisee March 7, 2025 10:30

coderabbitai bot reviewed Mar 7, 2025

View reviewed changes

timvisee approved these changes Mar 7, 2025

View reviewed changes

agourlay requested a review from JojiiOfficial March 7, 2025 11:26

JojiiOfficial approved these changes Mar 7, 2025

View reviewed changes

agourlay merged commit 7ba47bb into dev Mar 7, 2025
17 checks passed

agourlay deleted the no-rate-limite-scroll-for-transfers branch March 7, 2025 11:45

agourlay mentioned this pull request Mar 10, 2025

Do not rate limit resharding shard cleaning #6121

Merged

coderabbitai bot mentioned this pull request Mar 11, 2025

Write rate limit batch update per point count #6152

Merged

timvisee pushed a commit that referenced this pull request Mar 21, 2025

Do not rate limit scroll operations for shard transfers (#6118)

1f4809f

timvisee mentioned this pull request Mar 21, 2025

Bump version to 1.13.5 #6223

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Do not rate limit scroll operations for shard transfers #6118

Do not rate limit scroll operations for shard transfers #6118

Uh oh!

agourlay commented Mar 7, 2025 •

edited

Loading

Uh oh!

agourlay Mar 7, 2025

Uh oh!

coderabbitai bot commented Mar 7, 2025

Walkthrough

Suggested reviewers

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

Uh oh!

coderabbitai bot left a comment

Uh oh!

timvisee left a comment

Uh oh!

Uh oh!

Uh oh!

Do not rate limit scroll operations for shard transfers #6118

Do not rate limit scroll operations for shard transfers #6118

Uh oh!

Conversation

agourlay commented Mar 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

agourlay Mar 7, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot commented Mar 7, 2025

Walkthrough

Suggested reviewers

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

timvisee left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

agourlay commented Mar 7, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)