@zhassan-aws (Contributor) commented Jun 4, 2025

This PR is co-authored with @tautschnig.

Background: In #4127 we enabled certain platform-dependent target features (e.g. `sse` and `sse2` for x86_64 and `neon` for aarch64), which resulted in the `simd_bitmask` intrinsic being used more frequently. Kani's current model of that intrinsic (`unsafe fn simd_bitmask_impl<T, const LANES: usize>(input: &[T; LANES]) -> [u8; mask_len(LANES)]`) uses a loop that iterates `LANES` times. Harnesses that touch this code (e.g. those that use the `HashSet` data structure) therefore need a large unwinding value, which results in a significant slowdown.

This PR rewrites the function as a nested loop and manually unwinds the inner loop (which operates over the bits in a byte), so that a large unwinding value is no longer needed.
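To illustrate the idea, here is a minimal sketch of the two loop shapes. It is not the code from this PR: the function names `bitmask_single_loop` and `bitmask_nested` are hypothetical, the input is simplified to `bool` lanes instead of a generic `T`, the result is a `Vec<u8>` so that the sketch builds on stable Rust, and the bit order (lane `i` maps to bit `i % 8` of byte `i / 8`) is an assumption.

```rust
/// Number of mask bytes needed to hold one bit per lane.
const fn mask_len(lanes: usize) -> usize {
    (lanes + 7) / 8
}

/// Single-loop shape: one iteration per lane, so a bounded model checker
/// has to unwind this loop `LANES` times.
/// (Hypothetical helper; lanes are simplified to `bool`.)
fn bitmask_single_loop<const LANES: usize>(input: &[bool; LANES]) -> Vec<u8> {
    let mut mask = vec![0u8; mask_len(LANES)];
    for (lane, &set) in input.iter().enumerate() {
        if set {
            // Assumed bit order: least significant bit first within each byte.
            mask[lane / 8] |= 1u8 << (lane % 8);
        }
    }
    mask
}

/// Nested shape: the outer loop runs once per output byte, and the inner
/// loop over the eight bits of that byte is written out by hand.
/// Only `mask_len(LANES)` loop iterations remain.
fn bitmask_nested<const LANES: usize>(input: &[bool; LANES]) -> Vec<u8> {
    let mut mask = vec![0u8; mask_len(LANES)];
    for (byte_idx, byte) in mask.iter_mut().enumerate() {
        let base = byte_idx * 8;
        // Bit contribution of lane `base + offset`; lanes past the end
        // of the input contribute nothing.
        let bit = |offset: usize| -> u8 {
            let lane = base + offset;
            if lane < LANES && input[lane] { 1u8 << offset } else { 0 }
        };
        // Manually unrolled inner loop over the bits in a byte.
        *byte = bit(0) | bit(1) | bit(2) | bit(3) | bit(4) | bit(5) | bit(6) | bit(7);
    }
    mask
}
```

In this sketch, a 16-lane input needs 16 iterations in the single-loop shape but only 2 in the nested shape, which is why the required unwinding value shrinks by roughly a factor of eight.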

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

@zhassan-aws requested a review from a team as a code owner June 4, 2025 20:54
@github-actions bot added the Z-EndToEndBenchCI (Tag a PR to run benchmark CI) label Jun 4, 2025
@zhassan-aws enabled auto-merge June 4, 2025 23:36
@zhassan-aws added this pull request to the merge queue Jun 5, 2025
Merged via the queue into model-checking:main with commit 23c842d Jun 5, 2025
35 of 37 checks passed
@zhassan-aws deleted the bitmask-refactor branch June 5, 2025 01:23
zhassan-aws added a commit that referenced this pull request Jun 5, 2025
This PR is co-authored with @tautschnig.

Background: In #4127 we
enabled certain target features that are platform dependent (e.g. `sse`
and `sse2` for x86_64 and `neon` for aarch64) which resulted in using
the `simd_bitmask` intrinsic more frequently. Kani's current model of
that intrinsic
(https://github.com/model-checking/kani/blob/5f14b735b74f3ae3f9f1c64ce5656e1e735d42ea/library/kani/src/models/mod.rs#L72)
uses a loop that iterates `LANES` times, which requires harnesses that
touch this code (e.g. that use the `HashSet` data structure) to have a
large unwinding value, which results in a significant slowdown.

This PR refactors this function through rewriting it using a nested loop
and manually unwinding the inner loop (which operates over the bits in a
byte) so that a large unwinding value is not needed.

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 and MIT licenses.
github-merge-queue bot pushed a commit that referenced this pull request Jun 9, 2025
These are the automatically-generated release notes:
```
## What's Changed
* Toolchain upgrade to nightly-2025-05-04 by @thanhnguyen-aws in #4059
* Automatic toolchain upgrade to nightly-2025-05-05 by @github-actions in #4060
* Automatic toolchain upgrade to nightly-2025-05-06 by @github-actions in #4061
* Enable target features: x87 and sse2 by @thanhnguyen-aws in #4062
* Fix the bug: Loop contracts are not composable with function contracts  by @thanhnguyen-aws in #3979
* Automatic cargo update to 2025-05-12 by @github-actions in #4066
* Bump tests/perf/s2n-quic from `6aa9975` to `5f323b7` by @dependabot in #4068
* Fix stabilization instructions in RFC intro by @carolynzech in #4067
* Add support for quantifiers by @qinheping in #3993
* Toolchain upgrade to nightly-2025-05-07 by @thanhnguyen-aws in #4070
* Automatic toolchain upgrade to nightly-2025-05-08 by @github-actions in #4071
* Automatic toolchain upgrade to nightly-2025-05-09 by @github-actions in #4072
* Automatic toolchain upgrade to nightly-2025-05-10 by @github-actions in #4073
* Clippy/Stylistic Fixes by @carolynzech in #4074
* Upgrade toolchain to 2025-05-14 by @zhassan-aws in #4076
* Autoharness argument validation: only error on `--quiet` if `--list` was passed by @carolynzech in #4069
* Upgrade Rust toolchain to 2025-05-16 by @zhassan-aws in #4080
* Automatic toolchain upgrade to nightly-2025-05-17 by @github-actions in #4081
* Add setup scripts for Ubuntu 20.04 by @zhassan-aws in #4082
* Automatic toolchain upgrade to nightly-2025-05-18 by @github-actions in #4083
* Automatic cargo update to 2025-05-19 by @github-actions in #4086
* Automatic toolchain upgrade to nightly-2025-05-19 by @github-actions in #4085
* Automatic toolchain upgrade to nightly-2025-05-20 by @github-actions in #4091
* Bump tests/perf/s2n-quic from `5f323b7` to `22434aa` by @dependabot in #4089
* Fix the error that Kani panics when there is no external parameter in quantifier's closure. by @thanhnguyen-aws in #4088
* Update toolchain to 2025-05-22 by @carolynzech in #4098
* Use our toolchain when invoking `cargo metadata` by @carolynzech in #4090
* Automatic toolchain upgrade to nightly-2025-05-23 by @github-actions in #4099
* Automatic toolchain upgrade to nightly-2025-05-24 by @github-actions in #4101
* Automatic toolchain upgrade to nightly-2025-05-25 by @github-actions in #4102
* Fix a bug codegening `SwitchInt`s with only an otherwise branch by @bkirwi in #4095
* Automatic toolchain upgrade to nightly-2025-05-26 by @github-actions in #4104
* Automatic cargo update to 2025-05-26 by @github-actions in #4105
* Bump tests/perf/s2n-quic from `22434aa` to `550afb3` by @dependabot in #4106
* Automatic toolchain upgrade to nightly-2025-05-27 by @github-actions in #4107
* Update `kani::mem` pointer validity documentation by @carolynzech in #4092
* Add support for edition 2018 crates using assert! (Fixes #3717) by @sintemal in #4096
* Automatic toolchain upgrade to nightly-2025-05-28 by @github-actions in #4113
* Automatic toolchain upgrade to nightly-2025-05-29 by @github-actions in #4115
* Automatic toolchain upgrade to nightly-2025-05-30 by @github-actions in #4118
* Handle generic defaults in BoundedArbitrary derives by @zhassan-aws in #4117
* Automatic cargo update to 2025-06-02 by @github-actions in #4121
* Bump tests/perf/s2n-quic from `550afb3` to `8f54b57` by @dependabot in #4122
* Upgrade Rust toolchain to 2025-06-02 by @zhassan-aws in #4123
* Automatic toolchain upgrade to nightly-2025-06-03 by @github-actions in #4125
* Finish deprecating `--enable-unstable`, `--restrict-vtable`, and `--write-json-symtab` by @carolynzech in #4110
* `ty_mangled_name`: only use non-mangled name if `-Zcffi` is enabled. by @carolynzech in #4114
* Improve Help Menu by @carolynzech in #4109
* Start stabilizing `--jobs` and `list`; deprecate default memory checks by @carolynzech in #4108
* Refactor simd_bitmask to reduce the number of iterations by @zhassan-aws in #4129
* Set target features depending on the target architecture by @zhassan-aws in #4127
* Bump some versions suggested by cargo-outdated by @zhassan-aws in #4131
* Improve linking error output for `#[no_std]` crates by @AlexanderPortland in #4126
* Fix the git log command in the toolchain update script by @zhassan-aws in #4139
* Gate quantifiers behind an experimental feature by @thanhnguyen-aws in #4141
* Automatic cargo update to 2025-06-09 by @github-actions in #4145

## New Contributors
* @bkirwi made their first contribution in #4095
* @sintemal made their first contribution in #4096
* @AlexanderPortland made their first contribution in #4126

**Full Changelog**: kani-0.62.0...kani-0.63.0
```


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 and MIT licenses.

---------

Co-authored-by: Carolyn Zech <carolynzech@gmail.com>
zhassan-aws added a commit that referenced this pull request Jul 23, 2025