Skip to content

Conversation

vasild
Copy link
Contributor

@vasild vasild commented Apr 21, 2020

CAddrMan uses std::map internally even though it does not require
that the map's elements are sorted. std::map's access time is
O(log(map size)). std::unordered_map is more suitable as it has a
O(1) access time.

This patch lowers the execution times of CAddrMan's methods as follows
(as per src/bench/addrman.cpp):

AddrMan::Add(): -3.5%
AddrMan::GetAddr(): -76%
AddrMan::Good(): -0.38%
AddrMan::Select(): -45%

@vasild
Copy link
Contributor Author

vasild commented Apr 21, 2020

Full output from the benchmark (./src/bench/bench_bitcoin -filter="AddrMan.*"):

std::map

Benchmark evals iterations total min max median
AddrManAdd 5 5 4.45807 0.176704 0.179945 0.178443
AddrManGetAddr 5 500 4.09847 0.00163663 0.00164301 0.0016392
AddrManGood 5 2 4.66502 0.460193 0.474046 0.467368
AddrManSelect 5 1000000 2.07299 4.12892e-07 4.15564e-07 4.147e-07

std::unordered_map

Benchmark evals iterations total min max median
AddrManAdd 5 5 4.30032 0.169753 0.1738 0.172047
AddrManGetAddr 5 500 0.967398 0.000384406 0.00038996 0.000387176
AddrManGood 5 2 4.64742 0.460599 0.471216 0.464143
AddrManSelect 5 1000000 1.14085 2.23899e-07 2.36962e-07 2.26767e-07
raw output
map$ ./src/bench/bench_bitcoin -filter="AddrMan.*"
# Benchmark, evals, iterations, total, min, max, median
AddrManAdd, 5, 5, 4.45807, 0.176704, 0.179945, 0.178443
AddrManGetAddr, 5, 500, 4.09847, 0.00163663, 0.00164301, 0.0016392
AddrManGood, 5, 2, 4.66502, 0.460193, 0.474046, 0.467368
AddrManSelect, 5, 1000000, 2.07299, 4.12892e-07, 4.15564e-07, 4.147e-07

unordered_map$ ./src/bench/bench_bitcoin -filter="AddrMan.*"
# Benchmark, evals, iterations, total, min, max, median
AddrManAdd, 5, 5, 4.30032, 0.169753, 0.1738, 0.172047
AddrManGetAddr, 5, 500, 0.967398, 0.000384406, 0.00038996, 0.000387176
AddrManGood, 5, 2, 4.64742, 0.460599, 0.471216, 0.464143
AddrManSelect, 5, 1000000, 1.14085, 2.23899e-07, 2.36962e-07, 2.26767e-07

@fanquake fanquake added the P2P label Apr 21, 2020
@vasild vasild changed the title addrman: use unordered_map instead of map addrman: improve performance by using more suitable containers Apr 21, 2020
@fanquake fanquake requested a review from sipa April 22, 2020 08:45
@maflcko
Copy link
Member

maflcko commented Apr 22, 2020

Would it be possible to submit the benchmark upfront as a separate pull, so that https://codespeed.bitcoinperf.com/ can start running it?

@vasild
Copy link
Contributor Author

vasild commented Apr 24, 2020

Done: #18754

@vasild
Copy link
Contributor Author

vasild commented Apr 27, 2020

Removed the first commit (which introduced the benchmark), now that it is already merged into master and rebased.

@vasild vasild force-pushed the addrman_map branch 2 times, most recently from 2225590 to 7cc3172 Compare May 5, 2020 09:50
@DrahtBot
Copy link
Contributor

DrahtBot commented May 20, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request Jun 9, 2020
`CAddrMan` uses `std::map` internally even though it does not require
that the map's elements are sorted. `std::map`'s access time is
`O(log(map size))`. `std::unordered_map` is more suitable as it has a
`O(1)` access time.

This patch lowers the execution times of `CAddrMan`'s methods as follows
(as per `src/bench/addrman.cpp`):

```
AddrMan::Add(): -3.5%
AddrMan::GetAddr(): -76%
AddrMan::Good(): -0.38%
AddrMan::Select(): -45%
```

Github-Pull: bitcoin#18722
Rebased-From: 7cc3172
@sipa
Copy link
Member

sipa commented Jun 10, 2020

utACK 7cc3172

@hebasto
Copy link
Member

hebasto commented Jun 12, 2020

Concept ACK.

Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Benchmarking results:

$ ./src/bench/bench_bitcoin -filter="AddrMan.*"
# Benchmark, evals, iterations, total, min, max, median
AddrManAdd, 5, 5, 1.15889, 0.0458257, 0.0469068, 0.0464069
AddrManGetAddr, 5, 500, 1.94467, 0.000772949, 0.000789915, 0.000775272
AddrManGood, 5, 2, 5.69597, 0.563135, 0.57652, 0.570197
AddrManSelect, 5, 1000000, 1.17994, 2.32975e-07, 2.37942e-07, 2.36478e-07
  • master + this PR
$ ./src/bench/bench_bitcoin -filter="AddrMan.*"
# Benchmark, evals, iterations, total, min, max, median
AddrManAdd, 5, 5, 0.972042, 0.0383488, 0.0396571, 0.0386554
AddrManGetAddr, 5, 500, 0.649199, 0.000256967, 0.000262957, 0.00025942
AddrManGood, 5, 2, 5.35751, 0.531294, 0.540723, 0.534681
AddrManSelect, 5, 1000000, 0.657766, 1.29313e-07, 1.36428e-07, 1.30873e-07

Nice :)

Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: If #includes in addrman.h are already touched, mind moving

#include <fs.h>
#include <hash.h>
#include <streams.h>

into the first group?

@vasild
Copy link
Contributor Author

vasild commented Jun 15, 2020

nit: If #includes in addrman.h are already touched, mind moving...

Good catch, they should be moved! Given that there is already one ACK, I will move the headers if I alter this PR or as a followup PR if this gets merged as is now.

Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 7cc3172

@vasild feel free to ignore my nits :)

src/netaddress.h Outdated
Comment on lines 120 to 462
CSipHasher hasher(salt_k0, salt_k1);
hasher.Write(a.ip, sizeof(a.ip));
return (size_t)hasher.Finalize();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: All calls could be chained:

Suggested change
CSipHasher hasher(salt_k0, salt_k1);
hasher.Write(a.ip, sizeof(a.ip));
return (size_t)hasher.Finalize();
return CSipHasher(salt_k0, salt_k1).Write(a.ip, sizeof(a.ip)).Finalize();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually find the 3-line variant easier to read (that could be subjective). The 3-line variant is more manageable wrt setting a breakpoint or reading a backtrace (e.g. if you only see it crashed on that multi-chained-line it wouldn't be clear whether it crashed in the constructor, in the Write() method or in the Finalize() method).

I would prefer to leave it as is.

@vasild
Copy link
Contributor Author

vasild commented Jun 25, 2020

Prefixed salt_k0 and salt_k1 with m_ to abide to the naming convention. diff.

@vasild
Copy link
Contributor Author

vasild commented May 21, 2021

6f0c9f5390...6a1a28f2ad: rebased due to conflicts, plus add m_net to the CNetAddr hashing.

src/netaddress.h Outdated
size_t operator()(const CNetAddr& a) const noexcept
{
CSipHasher hasher(m_salt_k0, m_salt_k1);
hasher.Write(reinterpret_cast<const uint8_t*>(&a.m_net), sizeof(a.m_net));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Micronit: this reinterpret cast is a bit ugly. You could avoid it by just doing hasher.Write(a.m_net);; it'll interpret a.m_net as a uint64_t and use the optimized integer hasher.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I hope the implicit conversion from int to uint64_t will not upset some compiler.

Copy link
Member

@jonatack jonatack May 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ugliness could maybe be considered a virtue; it's explicit that this is an "on your head be it!" type of cast and highlights that the code could ideally be improved. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#es49-if-you-must-use-a-cast-use-a-named-cast

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we can do without a cast since there is Write() method that takes an integer argument (modulo int -> uint64 conversion which is ok because we never pass negative values).

@sipa
Copy link
Member

sipa commented May 21, 2021

utACK 6a1a28f

@vasild
Copy link
Contributor Author

vasild commented May 24, 2021

6a1a28f2ad...c22e98b30d: address review suggestion

Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK c22e98b

Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK c22e98b

@vasild
Copy link
Contributor Author

vasild commented May 28, 2021

c22e98b30d...1f4ee9c905: rebase due to conflicts and address review suggestion

Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 1f4ee9c

`CAddrMan` uses `std::map` internally even though it does not require
that the map's elements are sorted. `std::map`'s access time is
`O(log(map size))`. `std::unordered_map` is more suitable as it has a
`O(1)` access time.

This patch lowers the execution times of `CAddrMan`'s methods as follows
(as per `src/bench/addrman.cpp`):

```
AddrMan::Add(): -3.5%
AddrMan::GetAddr(): -76%
AddrMan::Good(): -0.38%
AddrMan::Select(): -45%
```
@vasild
Copy link
Contributor Author

vasild commented May 28, 2021

1f4ee9c905...a92485b2c2: address review suggestion

@jonatack
Copy link
Member

ACK a92485b

Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK a92485b, only suggested changes and rebased since my previous review.

@achow101
Copy link
Member

ACK a92485b

@fanquake fanquake merged commit 1a369f0 into bitcoin:master Jun 12, 2021
@vasild vasild deleted the addrman_map branch June 13, 2021 07:14
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Jun 13, 2021
gwillen pushed a commit to ElementsProject/elements that referenced this pull request Jun 1, 2022
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Aug 16, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.