-
Notifications
You must be signed in to change notification settings - Fork 37.7k
addrman: Improve performance of Good #22974
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cc @vasild |
Nice code reduction. Did you see a difference in |
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Building to bench. Some minor comments, feel free to ignore until concept acks and need to retouch.
Concept ACK. Benches are showing a huge speedup for me locally. before
after
|
Thanks! I didn't know about this benchmark, will try to confirm locally and add the result to OP! |
approach ACK, planning to do a more in-depth review later this week |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK. Maybe also add a commit to allow for shorter fuzz inputs: #21129 (comment)
This will allow the fuzz engine to potentially minimize further on crashes, or produce shorter inputs in general.
Concept ACK. |
Add_(addr, source, time_penalty); | ||
|
||
if (n > 0 && mapInfo.size() % n == 0) { | ||
Good_(addr, false, GetTime()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Completely unrelated, but I was wondering if it makes sense to fuzz the time here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment before this block warns against fuzzing in this loop, some problem with running out of random numbers. Could use insecure_rand
though?
527641b
to
e77aed6
Compare
This is done by removing an unnecessary loop in Good_() and looping through the new tables in MakeTried() more efficiently, choosing a starting value that allow us to stop early in typical cases. Co-authored-by: John Newbery <john@johnnewbery.com>
After the performance improvement for Good(), the direct method is only 2x faster as opposed to 60x before.
e77aed6
to
57ce203
Compare
Thanks for the reviews - I addressed all comments. I also ran the benchmark and got similar results to @jonatack:
after
|
ACK 57ce203 |
ACK 57ce203 Using the public interface to fuzz addrman is a much better approach. As well as being fewer lines of code, the fuzz test is more correct now. The previous code reached into the internals and made changes to private data which invalidated addrman's invariants. For example, entries were placed into the tried table without setting Lines 774 to 775 in 87780df
I think the next step is to move |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 57ce203
Nice addrman code cleanup and speed-up, taking advantage of nRefCount
. Without much surprise, the relative performance gains between PR and master branch, as shown by running the benchmark (./src/bench/bench_bitcoin -filter=AddrManGood
), are similar to the results by @jonatack and @mzumsande on my machine (>150x).
master:
| ns/op | op/s | err% | total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
| 1,333,812,989.00 | 0.75 | 3.2% | 6.66 | `AddrManGood`
| 1,289,262,095.00 | 0.78 | 0.9% | 6.47 | `AddrManGood`
| 1,339,632,431.00 | 0.75 | 1.6% | 6.72 | `AddrManGood`
| 1,311,484,130.00 | 0.76 | 0.4% | 6.50 | `AddrManGood`
| 1,326,841,389.00 | 0.75 | 1.6% | 6.59 | `AddrManGood`
PR:
| ns/op | op/s | err% | total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
| 8,645,751.00 | 115.66 | 1.7% | 0.04 | `AddrManGood`
| 8,715,892.00 | 114.73 | 1.2% | 0.04 | `AddrManGood`
| 8,076,594.00 | 123.81 | 1.7% | 0.04 | `AddrManGood`
| 8,715,701.00 | 114.74 | 1.2% | 0.04 | `AddrManGood`
| 8,545,046.00 | 117.03 | 2.2% | 0.04 | `AddrManGood`
(tested on OpenBSD 6.9, amd64)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 57ce203
Impressive 165x boost in the AddrManGood
benchmark (tested locally)! (2.3 op/s VS 380 op/s). Notice that it is adding entries with addr != source, so that lucky start_bucket
is not playing a role in the benchmark, I guess.
In commit acf656d fuzz: Use public interface to fill addrman tried tables
, Add_()
and Good_()
are not public interface, they are private methods.
@@ -85,7 +85,7 @@ class CAddrManDeterministic : public CAddrMan | |||
// 0, 1, 2, 3 corresponding to 0%, 100%, 50%, 33% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just above, on line 83 (github does not let me comment on it) is a comment:
// Add some of the addresses directly to the "tried" table.
I think the word "directly" does not apply anymore since the hackish direct addition is replaced by Add_()
(goes to "new") + Good_()
(moved to "tried").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, will fix this in a follow-up!
@@ -96,31 +96,12 @@ class CAddrManDeterministic : public CAddrMan | |||
for (size_t j = 0; j < num_addresses; ++j) { | |||
const auto addr = CAddress{CService{RandAddr(), 8333}, NODE_NETWORK}; | |||
const auto time_penalty = insecure_rand.randrange(100000001); | |||
#if 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I compared this branch with the #else
branch, filled addrman with 9k addresses and did not observe any difference in performance. That is - now the high level Add_()
+ Good_()
perform the same way as the hackish direct add to the tried table 🐎.
This is fantastic, faster and less code. |
I think it does: The first
That's true - thinking about doing a (fuzz-only) refactoring follow-up, in which this could be switched to |
Suggested here: bitcoin#22974 (comment)
Moves some of the setup into the benchmark loop so it is possible to run the loop for an arbitrary number of times. Due to recent optimizations in bitcoin#22974 the benchmark now runs much faster, so the inner loop now calls Good() 32 times as often to get better numbers. Renamed the benchmark to AddrManAddThenGood because that's now what is actually tested. To get the the number of just Good(), one needs to subtract the benchmark result of AddrManAdd.
Suggested here: bitcoin#22974 (comment)
57ce203 fuzz: allow lower number of sources (Martin Zumsande) acf656d fuzz: Use public interface to fill addrman tried tables (Martin Zumsande) eb2e113 addrman: Improve performance of Good (Martin Zumsande) Pull request description: Currently, `CAddrman::Good()` is rather slow because the process of moving an addr from new to tried involves looping over the new tables twice: 1) In `Good_()`, there is a loop searching for a new bucket the addr is currently in, but this information is never used except for aborting if it is not found anywhere (since [this commit](bitcoin@e6b343d#diff-49d1faa58beca1ee1509a247e0331bb91f8604e30a483a7b2dea813e6cea02e2R263) it is no longer passed to `MakeTried`) This is unnecessary because in a non-corrupted addrman, an address that is not in New must be either in Tried or not at all in addrman, both cases in which we'd return early in `Good_()` and never get to this point. I removed this loop (and left a check for `nRefCount` as a belt-and-suspenders check). 2) In `MakeTried()`, which is called from `Good_()`, another loop removes all instances of this address from new. This can be spedup by stopping the search at `nRefCount==0`. Further reductions in `nRefCount` would only lead to an assert anyway. Moreover, the search can be started at the bucket determined by the source of the addr for which `Good` was called, so that if it is present just once in New, no further buckets need to be checked. While calls to `Good()` are not that frequent normally, the performance gain is clearly seen in the fuzz target `addman_serdeser`, where, because of the slowness in creating a decently filled addrman, a shortcut was created that would directly populate the tried tables by reaching into addrman's internals, bypassing `Good()` (bitcoin#21129). I removed this workaround in the second commit: Using `Good()` is still slower by a factor of 2 (down from a factor of ~60 before), but I think that this compensated by the advantages of not having to reach into the internal structures of addrman (see jnewbery#18 (comment)). [Edit]: For benchmark results see bitcoin#22974 (comment) and bitcoin#22974 (comment) - the benchmark `AddrManGood` shows a significant speedup by a factor >100. ACKs for top commit: naumenkogs: ACK 57ce203 jnewbery: ACK 57ce203 laanwj: Code review ACK 57ce203 theStack: ACK 57ce203 vasild: ACK 57ce203 Tree-SHA512: fb6dfc198f2e28bdbb41cef9709828f22d83b4be0e640a3155ca42e771b6f58466de1468f54d773e794f780a79113f9f7d522032e87fdd75bdc4d99330445198
Suggested here: bitcoin#22974 (comment)
Suggested here: bitcoin#22974 (comment)
Suggested here: bitcoin#22974 (comment)
Suggested here: bitcoin#22974 (comment)
4445211 [fuzz] Update comment in FillAddrman() (John Newbery) 640476e [fuzz] Make Fill() a free function in fuzz/addrman.cpp (John Newbery) 90ad8ad [fuzz] Make RandAddr() a free function in fuzz/addrman.cpp (John Newbery) 491975c [fuzz] Pass FuzzedDataProvider& into Fill() in addrman fuzz tests (John Newbery) 56303e3 [fuzz] Create a FastRandomContext in addrman fuzz tests (John Newbery) Pull request description: #22974 improved the performance of `CAddrMan::Good()` significantly so that it could be used directly in the fuzz tests, instead of those tests reaching directly into addrman's internal data structures. This PR continues the work of making the fuzz tests only use `CAddrMan`'s public methods and pulls the `Fill()` and `RandAddr()` methods from `CAddrManDeterministic` into free functions, since they no longer need access to `CAddrMan` internals. ACKs for top commit: theuni: utACK 4445211. Agree the failure seems unrelated, looks like some startup race. mzumsande: ACK 4445211 vasild: ACK 4445211 Tree-SHA512: fcf994e1dedd0012b77f632720b6423d51ceda4eb85c9efe572f2a1150117f9e511114a5206738dd94409137287577f3b01a9998f5237de845410d3d96e7cb7f
4445211 [fuzz] Update comment in FillAddrman() (John Newbery) 640476e [fuzz] Make Fill() a free function in fuzz/addrman.cpp (John Newbery) 90ad8ad [fuzz] Make RandAddr() a free function in fuzz/addrman.cpp (John Newbery) 491975c [fuzz] Pass FuzzedDataProvider& into Fill() in addrman fuzz tests (John Newbery) 56303e3 [fuzz] Create a FastRandomContext in addrman fuzz tests (John Newbery) Pull request description: bitcoin#22974 improved the performance of `CAddrMan::Good()` significantly so that it could be used directly in the fuzz tests, instead of those tests reaching directly into addrman's internal data structures. This PR continues the work of making the fuzz tests only use `CAddrMan`'s public methods and pulls the `Fill()` and `RandAddr()` methods from `CAddrManDeterministic` into free functions, since they no longer need access to `CAddrMan` internals. ACKs for top commit: theuni: utACK 4445211. Agree the failure seems unrelated, looks like some startup race. mzumsande: ACK 4445211 vasild: ACK 4445211 Tree-SHA512: fcf994e1dedd0012b77f632720b6423d51ceda4eb85c9efe572f2a1150117f9e511114a5206738dd94409137287577f3b01a9998f5237de845410d3d96e7cb7f
Moves some of the setup into the benchmark loop so it is possible to run the loop for an arbitrary number of times. Due to recent optimizations in bitcoin#22974 the benchmark now runs much faster, so the inner loop now calls Good() 32 times as often to get better numbers. Renamed the benchmark to AddrManAddThenGood because that's now what is actually tested. To get the the number of just Good(), one needs to subtract the benchmark result of AddrManAdd.
Suggested here: bitcoin#22974 (comment)
Summary: > Currently, CAddrman::Good() is rather slow because the process of moving an addr from new to tried involves looping over the new tables twice: > > - In Good_(), there is a loop searching for a new bucket the addr is currently in, but this information is never used except for aborting if it is not found anywhere (since this commit it is no longer passed to MakeTried) > This is unnecessary because in a non-corrupted addrman, an address that is not in New must be either in Tried or not at all in addrman, both cases in which we'd return early in Good_() and never get to this point. > I removed this loop (and left a check for nRefCount as a belt-and-suspenders check). > > - In MakeTried(), which is called from Good_(), another loop removes all instances of this address from new. This can be spedup by stopping the search at nRefCount==0. Further reductions in nRefCount would only lead to an assert anyway. > Moreover, the search can be started at the bucket determined by the source of the addr for which Good was called, so that if it is present just once in New, no further buckets need to be checked. Improving performance of Good_ is done by removing an unnecessary loop in Good_() and looping through the new tables in MakeTried() more efficiently, choosing a starting value that allow us to stop early in typical cases. Co-authored-by: John Newbery <john@johnnewbery.com> This is a partial backport of [[bitcoin/bitcoin#22974 | core#22974]] bitcoin/bitcoin@eb2e113 Depends on D10855 The other two commits are not applicable due to a large stack of missing fuzz backports. Test Plan: `ninja all check-all` Benchmark: `./src/bench/bitcoin-bench -filter=AddrManGood ` Before: ``` | ns/op | op/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 96,564,809.00 | 10.36 | 0.2% | 0.49 | `AddrManGood` | 100,490,917.00 | 9.95 | 0.8% | 0.51 | `AddrManGood` | 99,080,029.00 | 10.09 | 0.9% | 0.50 | `AddrManGood` ``` After: ``` | ns/op | op/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 1,355,126.00 | 737.94 | 2.1% | 0.01 | `AddrManGood` | 1,330,379.00 | 751.67 | 5.7% | 0.01 | 〰️ `AddrManGood` (Unstable with ~1.0 iters. Increase `minEpochIterations` to e.g. 10) | 859,394.00 | 1,163.61 | 1.3% | 0.00 | `AddrManGood` ``` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Subscribers: Fabien Differential Revision: https://reviews.bitcoinabc.org/D10857
Currently,
CAddrman::Good()
is rather slow because the process of moving an addr from new to tried involves looping over the new tables twice:In
Good_()
, there is a loop searching for a new bucket the addr is currently in, but this information is never used except for aborting if it is not found anywhere (since this commit it is no longer passed toMakeTried
)This is unnecessary because in a non-corrupted addrman, an address that is not in New must be either in Tried or not at all in addrman, both cases in which we'd return early in
Good_()
and never get to this point.I removed this loop (and left a check for
nRefCount
as a belt-and-suspenders check).In
MakeTried()
, which is called fromGood_()
, another loop removes all instances of this address from new. This can be spedup by stopping the search atnRefCount==0
. Further reductions innRefCount
would only lead to an assert anyway.Moreover, the search can be started at the bucket determined by the source of the addr for which
Good
was called, so that if it is present just once in New, no further buckets need to be checked.While calls to
Good()
are not that frequent normally, the performance gain is clearly seen in the fuzz targetaddman_serdeser
, where, because of the slowness in creating a decently filled addrman, a shortcut was created that would directly populate the tried tables by reaching into addrman's internals, bypassingGood()
(#21129).I removed this workaround in the second commit: Using
Good()
is still slower by a factor of 2 (down from a factor of ~60 before), but I think that this compensated by the advantages of not having to reach into the internal structures of addrman (see jnewbery#18 (comment)).[Edit]: For benchmark results see #22974 (comment) and #22974 (comment) - the benchmark
AddrManGood
shows a significant speedup by a factor >100.