Add missing byteswap functions for MSVC #29036
This should give a significant speedup across the board.
Converted this to draft because I haven't actually tested the compile for MSVC (relying on c-i :( ), and because we'll need solid benchmarks before making any decisions.
Concept ACK on removing another MSVC-specific performance pessimization from the code.
It compiles OK.
What subset of benchmarks might we consider representative enough? For now, I'm observing very tiny improvements (a couple of %) in some of them. We should also remember that the MSVC build uses no optimisation (see e94ae81).
cc @sipsorcery.
Could rebase on master now that #29045 has gone in.
Builds fine for me on Windows 11 and msvc v19.38.33130 x64 (Visual Studio 2022). I ran
See #28674 (comment) for benches that could make a difference. But this requires a rebase on master first, see
Obsoleted by #29263.
86b7f28 serialization: use internal endian conversion functions (Cory Fields)
432b18c serialization: detect byteswap builtins without autoconf tests (Cory Fields)
297367b crypto: replace CountBits with std::bit_width (Cory Fields)
52f9bba crypto: replace non-standard CLZ builtins with c++20's bit_width (Cory Fields)
Pull request description:
This replaces #28674, #29036, and #29057. Now ready for testing and review.
Replaces platform-specific endian and byteswap functions. This is especially useful for kernel, as it means that our deep serialization code no longer requires bitcoin-config.h.
I apologize for the size of the last commit, but it's hard to avoid making those changes at once.
All platforms now use our internal functions rather than libc or platform-specific ones, with the exception of MSVC.
Sadly, benchmarking showed that not all compilers are capable of detecting and optimizing byteswap functions, so compiler builtins are instead used where possible. However, they're now detected via macros rather than autoconf checks. This [matches how libc++ implements std::byteswap for c++23](https://github.com/llvm/llvm-project/blob/main/libcxx/include/__bit/byteswap.h#L26).
I suggest we move/rename `compat/endian.h`, but I left that out of this PR to avoid bikeshedding.
#29057 pointed out some irregularities in benchmarks. After messing with various compilers and configs for a few weeks with these changes, I'm of the opinion that we can't win on every platform every time, so we should take the code that makes sense going forward. That said, if any real-world slowdowns are caused here, we should obviously investigate.
ACKs for top commit:
maflcko: ACK 86b7f28 📘
fanquake: ACK 86b7f28 - we can finish pruning out the __builtin_clz* checks/usage once the minisketch code has been updated. This is more good cleanup pre-CMake & for the kernel.
Tree-SHA512: 715a32ec190c70505ffbce70bfe81fc7b6aa33e376b60292e801f60cf17025aabfcab4e8c53ebb2e28ffc5cf4c20b74fe3dd8548371ad772085c13aec8b7970e
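For readers unfamiliar with the approach, here is a minimal sketch of what "detected via macros rather than autoconf checks" could look like. It is not the code from the merged commits: the names `portable_bswap32` and `example_bswap32` are hypothetical, and the exact preprocessor conditions are an assumption. The idea is only that the preprocessor, rather than a configure-time test, decides whether a compiler builtin is available.

```cpp
#include <cstdint>
#if defined(_MSC_VER)
#include <stdlib.h> // declares _byteswap_ulong on MSVC
#endif

// Compilers without __has_builtin get a stub that always answers "no".
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical portable fallback: plain shift-and-mask byteswap.
constexpr uint32_t portable_bswap32(uint32_t x)
{
    return ((x & 0x000000ffU) << 24) | ((x & 0x0000ff00U) << 8) |
           ((x & 0x00ff0000U) >> 8) | ((x & 0xff000000U) >> 24);
}

// Hypothetical wrapper: picks a builtin at preprocessing time when one is
// known to exist, otherwise falls back to the portable version.
inline uint32_t example_bswap32(uint32_t x)
{
#if __has_builtin(__builtin_bswap32) || defined(__GNUC__)
    return __builtin_bswap32(x);
#elif defined(_MSC_VER)
    return _byteswap_ulong(x);
#else
    return portable_bswap32(x);
#endif
}
```

The `defined(__GNUC__)` alternative is there because older GCC releases provide `__builtin_bswap32` but not `__has_builtin`.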
This should give a speedup across the board for MSVC builds.
While working on modernizing our byteswapping code for c++20, we noticed that MSVC uses our hand-written byteswap functions, as opposed to using libc/compiler versions like almost all other platforms.
aureleoules did some great benchmarks in #28674 which show that these hand-written byteswaps often compile down to a slow mess.
hebasto confirmed that we're indeed hitting these paths for MSVC.
Quick tests with godbolt show that MSVC's provided `_byteswap_*` functions do indeed speed things up.
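To illustrate the difference being discussed, the sketch below puts hand-written shift-and-mask swaps next to wrappers that call MSVC's `_byteswap_ushort`, `_byteswap_ulong` and `_byteswap_uint64` when building with MSVC. The function names (`manual_bswap*`, `msvc_or_manual_bswap*`) are made up for this example and are not the ones used in the PR; on other compilers the wrappers simply fall back to the hand-written versions so the snippet still compiles.

```cpp
#include <cstdint>
#if defined(_MSC_VER)
#include <stdlib.h> // declares _byteswap_ushort, _byteswap_ulong, _byteswap_uint64
#endif

// Hand-written swaps of the shift-and-mask kind (hypothetical names).
constexpr uint16_t manual_bswap16(uint16_t x)
{
    return static_cast<uint16_t>((x >> 8) | (x << 8));
}

constexpr uint32_t manual_bswap32(uint32_t x)
{
    return ((x & 0xff000000U) >> 24) | ((x & 0x00ff0000U) >> 8) |
           ((x & 0x0000ff00U) << 8) | ((x & 0x000000ffU) << 24);
}

constexpr uint64_t manual_bswap64(uint64_t x)
{
    // Byte-swap each 32-bit half, then exchange the halves.
    return (static_cast<uint64_t>(manual_bswap32(static_cast<uint32_t>(x))) << 32) |
           manual_bswap32(static_cast<uint32_t>(x >> 32));
}

// Wrappers that use the MSVC intrinsics when available.
inline uint16_t msvc_or_manual_bswap16(uint16_t x)
{
#if defined(_MSC_VER)
    return _byteswap_ushort(x);
#else
    return manual_bswap16(x);
#endif
}

inline uint32_t msvc_or_manual_bswap32(uint32_t x)
{
#if defined(_MSC_VER)
    return _byteswap_ulong(x);
#else
    return manual_bswap32(x);
#endif
}

inline uint64_t msvc_or_manual_bswap64(uint64_t x)
{
#if defined(_MSC_VER)
    return _byteswap_uint64(x);
#else
    return manual_bswap64(x);
#endif
}
```

Both variants return the same values; the claim in this PR is only about the machine code MSVC generates for each, which is best checked with a compiler explorer or a benchmark rather than assumed.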