Skip to content

Conversation

fanquake
Copy link
Member

@fanquake fanquake commented Apr 11, 2024

We can cherry-pick one commit from upstream leveldb, make the same change in crc32c, and then ultimately drop our build infra for testing endianness.

Not for merging until subtrees are updated:

@DrahtBot
Copy link
Contributor

DrahtBot commented Apr 11, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/29852.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK theuni, hebasto

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

No conflicts as of last run.

@theuni
Copy link
Member

theuni commented Apr 11, 2024

Concept ACK. Looking at the leveldb godbolt link, this is nicely optimized everywhere except MSVC. I'm ok with a possible regression there for the sake of the cleanup.

@theuni
Copy link
Member

theuni commented Apr 11, 2024

Want to upstream the crc32 patch to match the others we have sitting there?

@hebasto
Copy link
Member

hebasto commented Apr 11, 2024

Concept ACK.

@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.

Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.

Leave a comment here, if you need help tracking down a confusing failure.

Debug: https://github.com/bitcoin/bitcoin/runs/23709865964

@hebasto
Copy link
Member

hebasto commented Apr 14, 2024

Concept ACK. Looking at the leveldb godbolt link, this is nicely optimized everywhere except MSVC.

However, the changes in MSVC generated assembly code look quite significant.

I'm ok with a possible regression there for the sake of the cleanup.

I disagree. Before stacking another performance deterioration change on top of the pile of the currently unresolved performance issues in the MSVC builds, it would be nice to compare benchmarks in the first place.

@fanquake
Copy link
Member Author

However, the changes in MSVC generated assembly code look quite significant.
Before stacking another performance deterioration change on top of the pile

Isn't that because optimisations haven't been turned on? Otherwise, can you provide a concrete example of what you're talking about.

@fanquake
Copy link
Member Author

Want to upstream the crc32 patch to match the others we have sitting there?

Sure. Opened a PR in our crc32c subtree fork: bitcoin-core/crc32c-subtree#7, and one in Google upstream: google/crc32c#64.

@hebasto
Copy link
Member

hebasto commented Apr 16, 2024

However, the changes in MSVC generated assembly code look quite significant.
Before stacking another performance deterioration change on top of the pile

Isn't that because optimisations haven't been turned on? Otherwise, can you provide a concrete example of what you're talking about.

https://godbolt.org/z/of4T8hM8j provides examples with the /O2 optimization flag.

@hebasto
Copy link
Member

hebasto commented Apr 16, 2024

What benchmarks might be appropiate for testing changes like these?

@maflcko
Copy link
Member

maflcko commented Apr 16, 2024

What benchmarks might be appropiate for testing changes like these?

Microbenchmarks + IBD?

@theuni
Copy link
Member

theuni commented Apr 17, 2024

@hebasto Is there a venue for reporting this to MSVC? They recently patted themselves on the back for detecting similar patterns. It's a shame MSVC can't detect something that (in 2024) seems so obvious.

@hebasto
Copy link
Member

hebasto commented Apr 17, 2024

Is there a venue for reporting this to MSVC? They recently patted themselves on the back for detecting similar patterns. It's a shame MSVC can't detect something that (in 2024) seems so obvious.

cc @sipsorcery

@sipsorcery
Copy link
Contributor

Is there a venue for reporting this to MSVC? They recently patted themselves on the back for detecting similar patterns. It's a shame MSVC can't detect something that (in 2024) seems so obvious.

cc @sipsorcery

Most likely fruitless but can't hurt to ask.

https://x.com/sipsorcery/status/1780687316522266853

image

@hebasto
Copy link
Member

hebasto commented Apr 19, 2024

We can cherry-pick one commit from upstream leveldb, make the same change in crc32c, and then ultimately drop our build infra for testing endianness.

The same goal, which is dropping "build infra for testing endianness", might be achieved with an alternative approach, which essentially boils down to:

--- a/src/leveldb/util/coding.h
+++ b/src/leveldb/util/coding.h
@@ -62,7 +62,7 @@ char* EncodeVarint64(char* dst, uint64_t value);
 inline void EncodeFixed32(char* dst, uint32_t value) {
   uint8_t* const buffer = reinterpret_cast<uint8_t*>(dst);
 
-  if (port::kLittleEndian) {
+  if constexpr (std::endian::native == std::endian::little) {
     // Fast path for little-endian CPUs. All major compilers optimize this to a
     // single mov (x86_64) / str (ARM) instruction.
     std::memcpy(buffer, &value, sizeof(uint32_t));

And no MSVC code degradation :)

@fanquake
Copy link
Member Author

fanquake commented Apr 19, 2024

I don't think that's better. We have to keep all the redundant code, and now we are diverging from upstream for no reason.

@fanquake
Copy link
Member Author

Also, you're just moving the endianess testing into the code. The point is to drop all of this, and use generic code that doesn't require any tests at all. The fact that MSVC fails to perform basic optimisations is annoying, but I don't really see why it's a blocker here. If we'd just pulled the subtree, and this change came in as part of that, I doubt anyone would have even noticed anything MSVC related (still haven't seen any benchmarks showing any meaningful difference for this change as-is)?

@fanquake
Copy link
Member Author

Rebased, and switched the changes to CMake.

@theuni
Copy link
Member

theuni commented Dec 5, 2024

if constexpr (std::endian::native == std::endian::little) {

This is a c++20 feature unfortunately. So I don't imagine either upstream accepting it any time soon.

I agree with @fanquake that we shouldn't let MSVC (an unsupported and closed-source compiler) stand in the way of our progress. And this is a real barrier to us staying in sync with upstream. If we shipped msvc-built binaries that'd be one thing, but I don't see that ever happening.

@B5OFT

This comment was marked as spam.

fanquake added a commit to bitcoin-core/leveldb-subtree that referenced this pull request Jan 15, 2025
2e3c013 Remove leveldb::port::kLittleEndian. (Victor Costan)

Pull request description:

  Clang 10 includes the optimizations described in https://bugs.llvm.org/show_bug.cgi?id=41761. This means that the platform-independent implementations of {Decode,Encode}Fixed{32,64}() compile to one instruction on the most recent Clang and GCC.

  PiperOrigin-RevId: 306330166

  Cherry-picked from upstream: google@201f522.
  Currently part of bitcoin/bitcoin#29852.

ACKs for top commit:
  maflcko:
    cherry-pick ACK 2e3c013, but no strong opinion
  dergoegge:
    ACK 2e3c013

Tree-SHA512: 34de3c49f666f1810d583e091f068ab52fc2555914018e655feb5274216a2813e19d582111766f77f5af9dee7b02d0edcbdfec93f6185cc0e828a1beeb2370ac
@maflcko
Copy link
Member

maflcko commented Jan 15, 2025

MSVC

Has someone reported the request to them? If not, it seems less likely they'll fix it.

@hebasto
Copy link
Member

hebasto commented Jan 21, 2025

MSVC

Has someone reported the request to them? If not, it seems less likely they'll fix it.

https://developercommunity.visualstudio.com/t/Missed-optimization-for-consecutive-byte/10831400

@theuni

... MSVC (an unsupported ... compiler)...

Since when?

@maflcko
Copy link
Member

maflcko commented Jan 21, 2025

MSVC

Has someone reported the request to them? If not, it seems less likely they'll fix it.

https://developercommunity.visualstudio.com/t/Missed-optimization-for-consecutive-byte/10831400

Thx. Could also add a link (google/leveldb@201f522) to the thread to give one example that the code is used in the real world by a real software project?

@hebasto
Copy link
Member

hebasto commented Jan 21, 2025

MSVC

Has someone reported the request to them? If not, it seems less likely they'll fix it.

https://developercommunity.visualstudio.com/t/Missed-optimization-for-consecutive-byte/10831400

Thx. Could also add a link (google/leveldb@201f522) to the thread to give one example that the code is used in the real world by a real software project?

https://developercommunity.visualstudio.com/t/Missed-optimization-for-consecutive-byte/10831400#T-N10831434

fanquake added a commit that referenced this pull request Jan 21, 2025
910a11f build: remove LEVELDB_IS_BIG_ENDIAN (fanquake)
d336b7a Squashed 'src/leveldb/' changes from 688561cba8..04b5790928 (fanquake)

Pull request description:

  Includes:
  * bitcoin-core/leveldb-subtree#40 (used in #29852)
  * bitcoin-core/leveldb-subtree#45
  * bitcoin-core/leveldb-subtree#46

ACKs for top commit:
  kevkevinpal:
    Concept ACK [910a11f](910a11f)
  l0rinc:
    ACK 910a11f
  hebasto:
    ACK 910a11f, I've performed a subtree update locally and got the same changes.
  theuni:
    utACK 910a11f

Tree-SHA512: c5a2224c67d3fd598bc682589b805c324abf91003032a85764766048030285f56154779f29d3f0b3673c8f7f497ae62de5fc6b95ef0b022c873750053c7d27d5
@fanquake fanquake force-pushed the leveldb_cherrypicks branch from f770a75 to e34c793 Compare January 21, 2025 17:05
@fanquake fanquake changed the title [WIP] build: remove need to test for endianness build: remove need to test for endianness Jan 21, 2025
@l0rinc
Copy link
Contributor

l0rinc commented Jan 23, 2025

Microbenchmarks + IBD?

I ran a reindex-chainstate until 880k - to check for correctness and speed.
There was no obvious difference between runs (2 master vs 2 this PR).

Details
hyperfine \
--runs 2 \
--parameter-list COMMIT 523520f8279987cd528a9e2db6db13dc56641eff,72aa32fbae34c6fe151d3ab531974d6992dd065e \
--prepare 'rm -f /mnt/my_storage/BitcoinData/debug.log && git checkout {COMMIT} && git clean -fxd && git reset --hard \
&& cmake -B build -DCMAKE_BUILD_TYPE=Release -DBUILD_UTIL=OFF -DBUILD_TX=OFF -DBUILD_TESTS=OFF -DENABLE_WALLET=OFF -DINSTALL_MAN=OFF && cmake --build build -j$(nproc)' \
--cleanup 'mv /mnt/my_storage/BitcoinData/debug.log /mnt/my_storage/logs/debug-{COMMIT}.log' \
'COMMIT={COMMIT} ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=880000 -dbcache=30000 -reindex-chainstate -connect=0'

Benchmark 1: COMMIT=523520f8279987cd528a9e2db6db13dc56641eff ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=880000 -dbcache=30000 -reindex-chainstate -connect=0
  Time (mean ± σ):     23763.799 s ± 230.232 s    [User: 37683.470 s, System: 693.079 s]
  Range (min … max):   23601.001 s … 23926.598 s    2 runs
 
Benchmark 2: COMMIT=72aa32fbae34c6fe151d3ab531974d6992dd065e ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=880000 -dbcache=30000 -reindex-chainstate -connect=0
  Time (mean ± σ):     23804.995 s ± 97.138 s    [User: 37743.594 s, System: 690.682 s]
  Range (min … max):   23736.308 s … 23873.682 s    2 runs
 
Summary
  COMMIT=523520f8279987cd528a9e2db6db13dc56641eff ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=880000 -dbcache=30000 -reindex-chainstate -connect=0 ran
    1.00 ± 0.01 times faster than COMMIT=72aa32fbae34c6fe151d3ab531974d6992dd065e ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=880000 -dbcache=30000 -reindex-chainstate -connect=0

I did however struggle to run these tests in a simulated big endian env (tried the setup I documented at #31344 (comment)), I'm getting:

96% tests passed, 6 tests failed out of 136

Total Test time (real) = 283.13 sec

The following tests FAILED:
          1 - util_test_runner (Failed)
          3 - univalue_test (Not Run)
          4 - univalue_object_test (Not Run)
          5 - secp256k1_noverify_tests (Not Run)
          6 - secp256k1_tests (Not Run)
          7 - secp256k1_exhaustive_tests (Not Run)

Not sure if this passes for latest master (it did, ~2 months ago, on my previous laptop) - will experiment more, but these runs are extremely slow.

@maflcko
Copy link
Member

maflcko commented Jan 23, 2025

Not sure if this passes for latest master (it did, ~2 months ago, on my previous laptop) - will experiment more, but these runs are extremely slow.

qemu is expected to be slow. For reference, there was a recent change (#31657), so this may now be easier to run on non-linux machines, but I haven't tried. Running the s390x CI task should cover the changes here.

@l0rinc
Copy link
Contributor

l0rinc commented Jan 25, 2025

Running the s390x CI task should cover the changes here

I managed to run ctest with master and this branch (I have updated the instructions in #31344 (comment)).

But running functional/test_runner.py reveals a few failures that seem related to endianness changes, e.g. rpc_bind.py fails with:

AssertionError: not({('0100007f', 19192)} == {('7f000001', 19192)})

Details
stdout:
2025-01-25T15:47:08.065000Z TestFramework (INFO): PRNG seed is: 3770749326237847101
2025-01-25T15:47:08.076000Z TestFramework (INFO): Initializing test directory /tmp/test_runner_₿_🏃_20250125_164236/rpc_bind_266
2025-01-25T15:47:08.079000Z TestFramework (INFO): Check for ipv6
2025-01-25T15:47:08.080000Z TestFramework (INFO): Check for non-loopback interface
2025-01-25T15:47:08.082000Z TestFramework (INFO): Bind test for ['127.0.0.1']
2025-01-25T15:47:08.657000Z TestFramework (ERROR): Assertion failed
Traceback (most recent call last):
  File "/mnt/bitcoin/test/functional/test_framework/test_framework.py", line 135, in main
    self.run_test()
  File "/mnt/bitcoin/build_dev_mode/test/functional/rpc_bind.py", line 99, in run_test
    self._run_loopback_tests()
  File "/mnt/bitcoin/build_dev_mode/test/functional/rpc_bind.py", line 110, in _run_loopback_tests
    self.run_bind_test(['127.0.0.1'], '127.0.0.1', ['127.0.0.1'],
  File "/mnt/bitcoin/build_dev_mode/test/functional/rpc_bind.py", line 45, in run_bind_test
    assert_equal(set(get_bind_addrs(pid)), set(expected))
  File "/mnt/bitcoin/test/functional/test_framework/util.py", line 77, in assert_equal
    raise AssertionError("not(%s)" % " == ".join(str(arg) for arg in (thing1, thing2) + args))
AssertionError: not({('0100007f', 19192)} == {('7f000001', 19192)})
2025-01-25T15:47:08.717000Z TestFramework (INFO): Stopping nodes

Similar to 038755784d88ce7522ac2f98e8ba138010a64f82 from leveldb.
@DrahtBot
Copy link
Contributor

🤔 There hasn't been much activity lately and the CI seems to be failing.

If no one reviewed the current pull request by commit hash, a rebase can be considered. While the CI failure may be a false positive, the CI hasn't been running for some time, so there may be a real issue hiding as well. A rebase triggers the latest CI and makes sure that no silent merge conflicts have snuck in.

@fanquake
Copy link
Member Author

I'll come back to this once we land the changes upstream.

@fanquake fanquake closed this May 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants