Conversation
@coszio coszio commented Apr 25, 2025

Found this TODO while inspecting the code. I don't have great numbers to share, but when profiling bustle_bench, the flamegraph showed that flushing takes less time when there are lots of updates.

Before:
[flamegraph image]

After:
[flamegraph image]

coderabbitai bot commented Apr 25, 2025

📝 Walkthrough

The changes introduce a new batch-processing method for updating block usage within the bitmask in the lib/gridstore module. Specifically, a new method mark_blocks_batch is added to the Bitmask implementation, which allows marking multiple block ranges within a page as used or free in a single operation. The existing mark_blocks method is refactored to act as a wrapper that converts its parameters into a single-element iterator and delegates to mark_blocks_batch.

In the Gridstore implementation, the flush method is updated to use this batch functionality: instead of marking blocks as free one by one, it groups old pointers by page and block ranges, then calls mark_blocks_batch per page to update the bitmask in bulk. The itertools::Itertools trait is imported to facilitate the grouping of pointers. No public API signatures are changed as a result of these updates.
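As a rough illustration of the grouping step described above: collecting freed block ranges per page means the bitmask is touched once per page instead of once per pointer. The pointer layout and names below are hypothetical (the actual PR groups via itertools::Itertools); this is a stdlib-only sketch of the same idea.

```rust
use std::collections::BTreeMap;

// Hypothetical pointer layout: (page_id, block_offset, block_count).
type BlockPointer = (u32, u32, u32);

/// Group freed block ranges by page so each page's bitmask can be
/// updated with a single batched call instead of one call per pointer.
fn group_by_page(pointers: &[BlockPointer]) -> BTreeMap<u32, Vec<(u32, u32)>> {
    let mut by_page: BTreeMap<u32, Vec<(u32, u32)>> = BTreeMap::new();
    for &(page, offset, count) in pointers {
        by_page.entry(page).or_default().push((offset, count));
    }
    by_page
}

fn main() {
    let old_pointers = [(1, 0, 2), (0, 4, 1), (1, 8, 3), (0, 0, 2)];
    for (page, ranges) in group_by_page(&old_pointers) {
        // In the PR, this is where one batched bitmask update per page
        // would happen, e.g. (hypothetical signature):
        // bitmask.mark_blocks_batch(page, ranges.iter().copied(), /* used = */ false);
        println!("page {page}: {ranges:?}");
    }
}
```

The batched call amortizes the per-page bookkeeping (locating the page's region of the bitmask, updating region gap counters) across all ranges freed on that page.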

Additionally, a new benchmark named flush_bench is added to the Cargo.toml and implemented in benches/flush_bench.rs. This benchmark uses the Criterion framework to measure the performance of the flush operation under sequential and random update patterns with varying batch sizes. It prepopulates storage with 10,000 records and times flush operations after applying updates in different batch sizes, tracking hardware counters for payload I/O writes.
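For reference, a Criterion benchmark like the one described is registered in Cargo.toml with the default test harness disabled. The exact section contents in the PR may differ; the version shown is illustrative:

```toml
[[bench]]
name = "flush_bench"
harness = false

[dev-dependencies]
criterion = "0.5"
```

Without `harness = false`, Cargo's built-in bench harness would conflict with Criterion's own `main` generated by `criterion_main!`.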

@agourlay (Member)

Is it possible to make a criterion bench for this in order to have a clearer signal?

@coszio coszio force-pushed the free-bitmask-in-batch branch from c0ea248 to 091a9a6 on April 29, 2025 16:11
@coszio coszio (Contributor, Author) commented Apr 29, 2025

@agourlay I have now added a benchmark to make it easier to look at specific numbers:

❯ cargo bench --bench flush_bench -- --baseline dev --load-baseline batched
    Finished `bench` profile [optimized + debuginfo] target(s) in 0.15s
     Running benches/flush_bench.rs (/Users/luis/personal/Qdrant/qdrant/target/release/deps/flush_bench-bf4743a4ac2e299f)
Gnuplot not found, using plotters backend
flush after 100 sequential writes
                        time:   [151.85 µs 153.92 µs 156.33 µs]
                        change: [-20.549% -9.2999% +3.7517%] (p = 0.16 > 0.05)
                        No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
  4 (4.00%) high mild
  5 (5.00%) high severe

flush after 1000 sequential writes
                        time:   [399.93 µs 406.99 µs 414.63 µs]
                        change: [-28.520% -27.172% -25.775%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

flush after 10000 sequential writes
                        time:   [2.4493 ms 2.4771 ms 2.5052 ms]
                        change: [-57.140% -56.567% -56.033%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

flush after 100 random writes
                        time:   [1.1900 ms 1.1994 ms 1.2095 ms]
                        change: [-6.4605% -4.4169% -2.2229%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe

flush after 1000 random writes
                        time:   [1.4221 ms 1.4341 ms 1.4471 ms]
                        change: [-19.782% -18.579% -17.367%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

flush after 10000 random writes
                        time:   [3.1180 ms 3.1594 ms 3.2008 ms]
                        change: [-59.880% -59.348% -58.822%] (p = 0.00 < 0.05)
                        Performance has improved.

@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 4

🧹 Nitpick comments (3)
lib/gridstore/benches/flush_bench.rs (3)

8-87: Consider reducing code duplication between benchmarks

The sequential and random write benchmarks share almost identical setup code and only differ in how they select record IDs for updates. Consider extracting the common code into a helper function to improve maintainability.

use std::time::{Duration, Instant};

use criterion::Criterion;
use rand::Rng;

fn run_benchmark(
    c: &mut Criterion,
    name: &str,
    prepopulation_size: usize,
    unflushed_updates: usize,
    // FnMut so the sequential strategy can keep a counter between calls
    mut update_fn: impl FnMut(&mut rand::prelude::ThreadRng, usize) -> usize,
) {
    c.bench_function(name, |b| {
        // Setup: Create a storage with a specified number of records
        let (_dir, mut storage) = empty_storage();
        let mut rng = rand::thread_rng();
        let hw_counter = HardwareCounterCell::new();
        let hw_counter_ref = hw_counter.ref_payload_io_write_counter();

        // Pre-populate storage with data
        for i in 0..prepopulation_size {
            let payload = random_payload(&mut rng, 1);
            storage.put_value(i, &payload, hw_counter_ref).unwrap();
        }

        b.iter_custom(|iters| {
            let mut total_elapsed = Duration::ZERO;
            for _ in 0..iters {
                // Apply updates using the provided update strategy
                for _ in 0..unflushed_updates {
                    let id = update_fn(&mut rng, prepopulation_size);
                    let payload = random_payload(&mut rng, 1);
                    storage.put_value(id, &payload, hw_counter_ref).unwrap();
                }

                // Benchmark the flush operation
                let instant = Instant::now();
                storage.flush().unwrap();
                total_elapsed += instant.elapsed();
            }
            total_elapsed
        });
    });
}

pub fn flush_bench(c: &mut Criterion) {
    let prepopulation_size = 10_000;

    // Test with different update counts
    for &unflushed_updates in &[100, 1_000, prepopulation_size] {
        // Sequential updates: cycle through ids with a captured counter
        let bench_name = format!("flush after {unflushed_updates} sequential writes");
        let mut counter = 0;
        run_benchmark(c, &bench_name, prepopulation_size, unflushed_updates, move |_, _| {
            let id = counter;
            counter = (counter + 1) % unflushed_updates;
            id
        });

        // Random updates
        let bench_name = format!("flush after {unflushed_updates} random writes");
        run_benchmark(c, &bench_name, prepopulation_size, unflushed_updates, |rng, size| {
            rng.gen_range(0..size)
        });
    }
}

15-46: Consider configuring benchmark parameters for more meaningful results

The current benchmark doesn't specify custom measurement parameters like sample size or measurement time. Consider configuring these parameters to improve benchmark reliability, especially for the larger workloads.

Note that these parameters are configured on a BenchmarkGroup (or on the Criterion instance), not on the Bencher passed into the closure:

-        c.bench_function(&bench_name, |b| {
+        // Measurement parameters live on a BenchmarkGroup, not on the Bencher
+        let mut group = c.benchmark_group("flush");
+        group.measurement_time(Duration::from_secs(20));
+        group.sample_size(30);
+        group.bench_function(&bench_name, |b| {

(Call group.finish() after the last bench_function on the group.)
This applies to both benchmark functions.


24-25: Consider testing with varying payload sizes

The benchmark currently uses very small payloads (size 1) for all operations. Consider testing with different payload sizes to understand how payload size affects flush performance, especially since the optimization might perform differently with varying data sizes.

You could add another dimension to your benchmarks:

// Example modification for payload size testing
for payload_size in [1, 100, 1000] {
    for unflushed_updates in [100, 1_000, prepopulation_size].iter() {
        let bench_name = format!("flush after {} sequential writes with payload size {}", 
                                unflushed_updates, payload_size);
        
        // Then use `payload_size` variable instead of the hardcoded 1
        let payload = random_payload(&mut rng, payload_size);
        // ...
    }
}

Also applies to: 62-63, 33-35, 71-74

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c0ea248 and 091a9a6.

📒 Files selected for processing (4)
  • lib/gridstore/Cargo.toml (1 hunks)
  • lib/gridstore/benches/flush_bench.rs (1 hunks)
  • lib/gridstore/src/bitmask/mod.rs (1 hunks)
  • lib/gridstore/src/gridstore.rs (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • lib/gridstore/Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (2)
  • lib/gridstore/src/bitmask/mod.rs
  • lib/gridstore/src/gridstore.rs

@coszio coszio merged commit 1864e79 into dev Apr 29, 2025
@coszio coszio deleted the free-bitmask-in-batch branch April 29, 2025 20:22
generall pushed a commit that referenced this pull request May 22, 2025
* update region gaps in batch when flushing

* rename to relative -> local

* add flushing bench