-
Notifications
You must be signed in to change notification settings - Fork 37.7k
addrman: improve performance by using more suitable containers #18722
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Full output from the benchmark (
raw output
|
Would it be possible to submit the benchmark upfront as a separate pull, so that https://codespeed.bitcoinperf.com/ can start running it? |
Done: #18754 |
Removed the first commit (which introduced the benchmark), now that it is already merged into |
2225590
to
7cc3172
Compare
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
`CAddrMan` uses `std::map` internally even though it does not require that the map's elements are sorted. `std::map`'s access time is `O(log(map size))`. `std::unordered_map` is more suitable as it has a `O(1)` access time. This patch lowers the execution times of `CAddrMan`'s methods as follows (as per `src/bench/addrman.cpp`): ``` AddrMan::Add(): -3.5% AddrMan::GetAddr(): -76% AddrMan::Good(): -0.38% AddrMan::Select(): -45% ``` Github-Pull: bitcoin#18722 Rebased-From: 7cc3172
utACK 7cc3172 |
Concept ACK. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Benchmarking results:
- master (8c97780)
$ ./src/bench/bench_bitcoin -filter="AddrMan.*"
# Benchmark, evals, iterations, total, min, max, median
AddrManAdd, 5, 5, 1.15889, 0.0458257, 0.0469068, 0.0464069
AddrManGetAddr, 5, 500, 1.94467, 0.000772949, 0.000789915, 0.000775272
AddrManGood, 5, 2, 5.69597, 0.563135, 0.57652, 0.570197
AddrManSelect, 5, 1000000, 1.17994, 2.32975e-07, 2.37942e-07, 2.36478e-07
- master + this PR
$ ./src/bench/bench_bitcoin -filter="AddrMan.*"
# Benchmark, evals, iterations, total, min, max, median
AddrManAdd, 5, 5, 0.972042, 0.0383488, 0.0396571, 0.0386554
AddrManGetAddr, 5, 500, 0.649199, 0.000256967, 0.000262957, 0.00025942
AddrManGood, 5, 2, 5.35751, 0.531294, 0.540723, 0.534681
AddrManSelect, 5, 1000000, 0.657766, 1.29313e-07, 1.36428e-07, 1.30873e-07
Nice :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: If #include
s in addrman.h
are already touched, mind moving
#include <fs.h>
#include <hash.h>
#include <streams.h>
into the first group?
Good catch, they should be moved! Given that there is already one ACK, I will move the headers if I alter this PR or as a followup PR if this gets merged as is now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/netaddress.h
Outdated
CSipHasher hasher(salt_k0, salt_k1); | ||
hasher.Write(a.ip, sizeof(a.ip)); | ||
return (size_t)hasher.Finalize(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: All calls could be chained:
CSipHasher hasher(salt_k0, salt_k1); | |
hasher.Write(a.ip, sizeof(a.ip)); | |
return (size_t)hasher.Finalize(); | |
return CSipHasher(salt_k0, salt_k1).Write(a.ip, sizeof(a.ip)).Finalize(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually find the 3-line variant easier to read (that could be subjective). The 3-line variant is more manageable wrt setting a breakpoint or reading a backtrace (e.g. if you only see it crashed on that multi-chained-line it wouldn't be clear whether it crashed in the constructor, in the Write()
method or in the Finalize()
method).
I would prefer to leave it as is.
Prefixed |
|
src/netaddress.h
Outdated
size_t operator()(const CNetAddr& a) const noexcept | ||
{ | ||
CSipHasher hasher(m_salt_k0, m_salt_k1); | ||
hasher.Write(reinterpret_cast<const uint8_t*>(&a.m_net), sizeof(a.m_net)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Micronit: this reinterpret cast is a bit ugly. You could avoid it by just doing hasher.Write(a.m_net);
; it'll interpret a.m_net
as a uint64_t and use the optimized integer hasher.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I hope the implicit conversion from int
to uint64_t
will not upset some compiler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ugliness could maybe be considered a virtue; it's explicit that this is an "on your head be it!" type of cast and highlights that the code could ideally be improved. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#es49-if-you-must-use-a-cast-use-a-named-cast
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we can do without a cast since there is Write()
method that takes an integer argument (modulo int
-> uint64
conversion which is ok because we never pass negative values).
utACK 6a1a28f |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK c22e98b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK c22e98b
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 1f4ee9c
`CAddrMan` uses `std::map` internally even though it does not require that the map's elements are sorted. `std::map`'s access time is `O(log(map size))`. `std::unordered_map` is more suitable as it has a `O(1)` access time. This patch lowers the execution times of `CAddrMan`'s methods as follows (as per `src/bench/addrman.cpp`): ``` AddrMan::Add(): -3.5% AddrMan::GetAddr(): -76% AddrMan::Good(): -0.38% AddrMan::Select(): -45% ```
|
ACK a92485b |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK a92485b |
CAddrMan
usesstd::map
internally even though it does not requirethat the map's elements are sorted.
std::map
's access time isO(log(map size))
.std::unordered_map
is more suitable as it has aO(1)
access time.This patch lowers the execution times of
CAddrMan
's methods as follows(as per
src/bench/addrman.cpp
):