Skip to content

Conversation

sstadick
Copy link
Owner

@sstadick sstadick commented Jul 2, 2025

The BITS paper uses fully inclusive ranges. When this was written I'm not sure I understood that. To get things to match the naive version of count (find -> count) I had a while loop in the bits count method to advance the cursor past the matched start/stop index.

This change was found while porting mojo-lapper and removes the while loop. It also has a more effecient branchless binary search.

Benchmarks for count improve 24-30% (see PR).

Bakeoff/rust-lapper: count with 100% hit rate                                                                             
                        time:   [14.742 us 14.773 us 14.811 us]
                        change: [-24.448% -24.164% -23.869%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 18 outliers among 100 measurements (18.00%)
  18 (18.00%) high severe
Bakeoff/rust-lapper: count with below 100% hit rate                                                                             
                        time:   [2.1958 us 2.2019 us 2.2076 us]
                        change: [-30.391% -29.790% -29.274%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  6 (6.00%) low mild                                                                                                                                              
Benchmarking Bakeoff/rust-lapper: count with below 100% hit rate - chromosome spanning interval: Collecting 100 samples in estimated 5.0080 s (2.3M ite                                                                                                                                                       Bakeoff/rust-lapper: count with below 100% hit rate - chromosome spanning interval                        
                        time:   [2.1989 us 2.2070 us 2.2148 us]
                        change: [-32.616% -30.807% -29.536%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) low mild
  1 (1.00%) high severe

@sstadick sstadick force-pushed the fix/bits_iv_inclusive branch 2 times, most recently from b8e3b88 to 9cc7957 Compare July 2, 2025 09:36
@sstadick sstadick requested a review from Copilot July 2, 2025 09:36
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the interval counting logic to remove an extra cursor-advancing loop, replaces the traditional binary search with a branchless version, and bumps the crate version to 1.2.0.

  • Remove mutable top binding and simplify stack handling
  • Introduce branchless bsearch_seq, eliminate inclusive-range loop in count
  • Bump rust-lapper version from 1.1.0 to 1.2.0

Reviewed Changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated no comments.

File Description
src/lib.rs Simplified stack pop, replaced binary search with branchless variant, adjusted half-open logic in count, and left a debug println! in tests
Cargo.toml Updated package version to 1.2.0
Comments suppressed due to low confidence (3)

src/lib.rs:431

  • [nitpick] The variable name cursor is ambiguous in the context of a search index; consider renaming to something like index or position, and likewise rename length to remaining or span for clarity.
        let mut cursor = 0;

src/lib.rs:587

  • [nitpick] It would be helpful to add unit tests targeting edge cases around the start + one::<I>() adjustment to ensure the half-open interval logic remains correct under all boundary conditions.
        let first = Self::bsearch_seq(start + one::<I>(), &self.stops);

src/lib.rs:1009

  • This debug println! in the test appears to be leftover and will clutter test output; consider removing it.
        println!("{:?}",lapper);

The BITS paper uses fully inclusive ranges. When this was written
I'm not sure I understood that. To get things to match the naive
version of count (find -> count) I had a while loop in the bits
count method to advance the cursor past the matched start/stop
index.

This change was found while porting mojo-lapper and removes the
while loop. It also has a more effecient branchless binary search.

Benchmarks for count improve 24-30% (see PR).
@sstadick sstadick force-pushed the fix/bits_iv_inclusive branch from 9cc7957 to 6f55829 Compare July 2, 2025 09:41
@sstadick sstadick merged commit 7e3904d into master Jul 2, 2025
10 checks passed
@sstadick sstadick deleted the fix/bits_iv_inclusive branch July 2, 2025 09:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant