-
Notifications
You must be signed in to change notification settings - Fork 37.7k
[IBD] precalculate SipHash constant salt XORs #30442
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/30442. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsNo conflicts as of last run. |
ef4a436
to
3b7b82b
Compare
@andrewtoth, this is another tiny addition to the coincache speedup, your review would be welcome. |
I did not see any improvement in the benchmark with this change. Running commit 3b7b82b
commit 7ac4e3a
|
That's disappointing, thanks for checking, something's still off with my compiler, it seems. |
I ran |
Now that the other benchmark finished after a few days, I ran this on a Outdated resultsAfter the first run I got a warning of:
which I've fixed and also ran
./configure --enable-bench
git checkout 7ac4e3aaf345a06fdde463100c45c4295d730820 && git reset --hard
make -j$(nproc) && ./src/bench/bench_bitcoin -filter='.*Out[pP]oint.*' --min-time=10000
git checkout 3b7b82b4b0c39a38538aae2cba10bec3907c5cbf && git reset --hard
make -j$(nproc) && ./src/bench/bench_bitcoin -filter='.*Out[pP]oint.*' --min-time=10000
Resulting in: SaltedOutpointHasherBenchmark_create_set: Before Avg = 7443000.96, After Avg = 7537349.76, Increase = 1.27% SaltedOutpointHasherBenchmark_hash: Before Avg = 41877576.91, After Avg = 42667898.67, Increase = 1.89% SaltedOutpointHasherBenchmark_match: Before Avg = 16686007.45, After Avg = 16956121.52, Increase = 1.62% SaltedOutpointHasherBenchmark_mismatch: Before Avg = 12570108.79, After Avg = 12938653.98, Increase = 2.93% @andrewtoth, what do you think, can you try reproducing these results? |
Based on your benchmark results, I don't think this can be called an optimization. It seems to be worse for one benchmark, better in another, and roughly the same for the rest. The description also notes that this will not be noticeable by users. In light of that, does the motivation for this PR still hold? |
I've removed the |
0dc224b
to
8d6c6bd
Compare
🚧 At least one of the CI tasks failed. HintsMake sure to run all tests locally, according to the documentation. The failure may happen due to a number of reasons, for example:
Leave a comment here, if you need help tracking down a confusing failure. |
8d6c6bd
to
44098fe
Compare
44098fe
to
183052f
Compare
183052f
to
bc959f5
Compare
Sorry for taking so long. FWIW, Code review ACK bc959f5. Moving the magic numbers to constants seems like a good idea to me. It's good to add the benchmarks. And although the speed change is neglilible (as would be expected from factoring out ~four xor instructions), it doesn't complicate the code much either. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced about the approach here: it's leaking a SipHash implementation detail (the way the initial v0..v3 values are computed) into a higher layer (util/hasher.h). I would suggest to have a |
2885e77
to
d848da9
Compare
Thanks a lot for the review @sipa. The I'm not sure why the suggested name was I pushed a separate rebase (without conflicts), to simplify review, and the requested changes (and a few more related cleanups) in https://github.com/bitcoin/bitcoin/compare/2885e7785417b57a3f3cad420196837965572258..d848da97f2e4fd0997507b2d1c2a1f4a2325193d I'd like to clean up |
d848da9
to
31e1a49
Compare
31e1a49
to
987be1b
Compare
…ash constant state Previously, only k0 and k1 were stored, causing the constant xor operations to be recomputed in every call to `SipHashUint256Extra`. This commit adds a dedicated `Uint256ExtraSipHasher` class that caches the initial state (v0-v3) and to perform the `SipHash` computation on a `uint256` (with an extra parameter), hiding the constant computation details from higher-level code and improving efficiency. This basically brings the precalculations in the `CSipHasher` constructor to the `uint256` specialized SipHash implementation. > cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release && cmake --build build -j$(nproc) && build/src/bench/bench_bitcoin -filter='SaltedOutpointHasherBench.*' -min-time=10000 > C++ compiler .......................... AppleClang 16.0.0.16000026 | ns/op | op/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 57.27 | 17,462,299.19 | 0.1% | 11.02 | `SaltedOutpointHasherBench_create_set` | 11.24 | 88,997,888.48 | 0.3% | 11.04 | `SaltedOutpointHasherBench_hash` | 13.91 | 71,902,014.20 | 0.2% | 11.01 | `SaltedOutpointHasherBench_match` | 13.29 | 75,230,390.31 | 0.1% | 11.00 | `SaltedOutpointHasherBench_mismatch` compared to master: create_set - 17,462,299.19/17,065,922.04 - 2.3% faster hash - 88,997,888.48/83,576,684.83 - 6.4% faster match - 71,902,014.20/68,985,850.12 - 4.2% faster mismatch - 75,230,390.31/71,942,033.47 - 4.5% faster > C++ compiler .......................... GNU 13.3.0 | ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:---------- | 135.38 | 7,386,349.49 | 0.0% | 1,078.19 | 486.16 | 2.218 | 119.56 | 1.1% | 11.00 | `SaltedOutpointHasherBench_create_set` | 23.67 | 42,254,558.08 | 0.0% | 247.01 | 85.01 | 2.906 | 4.00 | 0.0% | 11.00 | `SaltedOutpointHasherBench_hash` | 58.95 | 16,962,220.14 | 0.1% | 446.55 | 211.74 | 2.109 | 20.86 | 1.4% | 11.01 | `SaltedOutpointHasherBench_match` | 76.98 | 12,991,047.69 | 0.1% | 548.93 | 276.50 | 1.985 | 20.25 | 2.3% | 10.72 | `SaltedOutpointHasherBench_mismatch` compared to master: create_set - 7,386,349.49/7,312,133.16 - 1% faster hash - 42,254,558.08/41,978,882.62 - 0.6% faster match - 16,962,220.14/16,549,695.42 - 2.4% faster mismatch - 12,991,047.69/12,713,595.35 - 2% faster Co-authored-by: sipa <pieter@wuille.net>
Rebased to fix a conflict with #33116. Also dropped the benchmarking commit to make the change less scary - but kept the patch in the PRs description for anyone interested. Change is ready for review again. |
987be1b
to
656da51
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK to be pinged by the bot, plan to review in a couple of weeks.
Concept ACK I haven't tested the code but I have reviewed it and it looks reasonable; i think the approach is sound and the two class design better addresses the matter. |
This change is part of [IBD] - Tracking PR for speeding up Initial Block Download
Summary
The in-memory representation of the UTXO set uses (salted) SipHash for avoiding key collision attacks.
Hashing a
uint256
key is done so often that a specialized optimization was extracted to SipHashUint256Extra. The constant salting operations were already extracted in the general case, this PR adjust the main specialization similarly.Details
Previously, only
k0
andk1
were stored, causing the constant xor operations to be recomputed in every call toSipHashUint256Extra
.This commit adds a dedicated
Uint256ExtraSipHasher
class that caches the initial state (v0-v3) and to perform theSipHash
computation on auint256
(with an extra parameter), hiding the constant computation details from higher-level code and improving efficiency (as suggested by Sipa in #30442 (comment)).Measurements
Benchmarks
SaltedOutpointHasherBench_create_set
SaltedOutpointHasherBench_hash
SaltedOutpointHasherBench_match
SaltedOutpointHasherBench_mismatch
SaltedOutpointHasherBench_create_set
SaltedOutpointHasherBench_hash
SaltedOutpointHasherBench_match
SaltedOutpointHasherBench_mismatch
compared to master:
SaltedOutpointHasherBench_create_set
SaltedOutpointHasherBench_hash
SaltedOutpointHasherBench_match
SaltedOutpointHasherBench_mismatch
SaltedOutpointHasherBench_create_set
SaltedOutpointHasherBench_hash
SaltedOutpointHasherBench_match
SaltedOutpointHasherBench_mismatch