Skip to content

Conversation

jnewbery
Copy link
Contributor

#22974 improved the performance of CAddrMan::Good() significantly so that it could be used directly in the fuzz tests, instead of those tests reaching directly into addrman's internal data structures.

This PR continues the work of making the fuzz tests only use CAddrMan's public methods and pulls the Fill() and RandAddr() methods from CAddrManDeterministic into free functions, since they no longer need access to CAddrMan internals.

@jnewbery
Copy link
Contributor Author

@vasild - this addresses your review comment here: #22974 (review)

@DrahtBot DrahtBot added the Tests label Sep 21, 2021
@jnewbery jnewbery force-pushed the 2021-09-addrman-fuzzer-public-methods2 branch from 00a2c72 to a41f097 Compare September 21, 2021 13:13
@jnewbery
Copy link
Contributor Author

Rebased

@jnewbery jnewbery force-pushed the 2021-09-addrman-fuzzer-public-methods2 branch from a41f097 to 19e2d94 Compare September 22, 2021 11:02
@DrahtBot
Copy link
Contributor

DrahtBot commented Sep 22, 2021

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #22910 ([RFC] Encapsulate asmap in NetGroupManager by jnewbery)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 19e2d94

The public Good() will do some extra locking and call Check() in comparison to Good_(). The locking should be insignificant wrt performance and the Check() should be noop due to the used consistency_check_ratio=0. Also, the public Good() will do "test before evict" which could cause some slowdown.

I did measure the timings to fill an addrman with hardcoded n=3 (33% tried), num_sources=50 and num_addresses=500 on both master and this PR. The addrman ends up with about 16k addresses.

master:

   13.493194 sec: fill
    0.534130 sec: ser
    5.664423 sec: deser
    1.330460 sec: compare

   13.488360 sec: fill
    0.529679 sec: ser
    5.632100 sec: deser
    1.321111 sec: compare

This PR:

   14.130295 sec: fill
    0.553457 sec: ser
    5.809201 sec: deser
    1.392843 sec: compare

   13.757125 sec: fill
    0.556877 sec: ser
    5.863635 sec: deser
    1.371259 sec: compare


if (n > 0 && mapInfo.size() % n == 0) {
Good_(addr, false, GetTime());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new code calls Good() which calls Good_(/*test_before_evict=*/true) whereas the old code passed false. I think this is ok.

@practicalswift
Copy link
Contributor

Concept ACK

@jnewbery
Copy link
Contributor Author

I'm moving this to draft for now. It conflicts with #22950, which I think should go in first.

@jnewbery jnewbery marked this pull request as draft September 29, 2021 14:54
@jnewbery jnewbery force-pushed the 2021-09-addrman-fuzzer-public-methods2 branch from 19e2d94 to 0a1eb04 Compare October 4, 2021 14:01
@jnewbery
Copy link
Contributor Author

jnewbery commented Oct 4, 2021

Rebased on #22950. Leaving as a draft until that PR is merged.

Don't reach inside the object-under-test to use its random context.
Use a (reference) parameter instead of a data member of
CAddrManDeterministic. This will allow us to make Fill() a free function
in a later commit.

Also remove CAddrManDeterministic.m_fuzzed_data_provider since it's no
longer used.
@jnewbery jnewbery force-pushed the 2021-09-addrman-fuzzer-public-methods2 branch from 0a1eb04 to c0555ec Compare October 5, 2021 15:39
It doesn't require access to CAddrManDeterministic
Also rename it to FillAddrman and pass an addrman reference as an
argument. Change FillAddrman to only use addrman's public interface methods.
@jnewbery jnewbery force-pushed the 2021-09-addrman-fuzzer-public-methods2 branch from c0555ec to 4445211 Compare October 5, 2021 15:59
@jnewbery
Copy link
Contributor Author

jnewbery commented Oct 5, 2021

Rebased. Marking ready for review.

@jnewbery jnewbery marked this pull request as ready for review October 5, 2021 16:00
@maflcko
Copy link
Member

maflcko commented Oct 5, 2021

SUMMARY: ThreadSanitizer: data race /usr/lib/llvm-12/bin/../include/c++/v1/ios:523:12 in std::__1::ios_base::width() const

@amitiuttarwar
Copy link
Contributor

concept ACK

@jnewbery
Copy link
Contributor Author

jnewbery commented Oct 5, 2021

SUMMARY: ThreadSanitizer: data race /usr/lib/llvm-12/bin/../include/c++/v1/ios:523:12 in std::__1::ios_base::width() const

@MarcoFalke - I don't see how that can be related to this PR. The failure is in validation_chainstate_tests, which I don't touch here (changes are only in fuzz/addrman.cpp).

@theuni
Copy link
Member

theuni commented Oct 5, 2021

utACK 4445211. Agree the failure seems unrelated, looks like some startup race.

Thanks very much @vasild for the benchmarks. Indeed it appears the only functional change here is hitting more locks and test_before_evict is true, good to know those don't slow things down too much.

It appears to me that AddrManDeterministic would no longer be required if AddrMan accepted a reference to a FastRandomContext and grew an equality operator, though I'm not sure there's any need for that.

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 4445211

@maflcko
Copy link
Member

maflcko commented Oct 6, 2021

Correct, the tsan failure is unrelated (known bug with boost). Sorry for missing this. Re-run the task, but it can be ignored either way.

@mzumsande
Copy link
Contributor

ACK 4445211

Reviewed the code and did a few hours of fuzzing addrman_serdeser, and the results look as expected to me (e.g. addrman of various sizes are created for different fuzzing inputs).

Not related to this PR, but the structure of src/test/addrman.cpp is a bit confusing to me, maybe the fuzz target data_stream_addr_man should be moved from the beginning of the file to the other targets.

@@ -307,7 +302,7 @@ FUZZ_TARGET_INIT(addrman_serdeser, initialize_addrman)

CDataStream data_stream(SER_NETWORK, PROTOCOL_VERSION);

addr_man1.Fill();
FillAddrman(addr_man1, fuzzed_data_provider);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to the changes in this PR, but a general observation on the changed fuzz test:
I just noticed that addrman_serdeser is currently the slowest (>60 minutes, slower almost by a factor of 2 compared to the second slowest one) of fuzz tests when run for the qa asset corpus (CI Link), so it seems to be the current bottleneck for a CI run to finish. I wonder if it this is still in the acceptable range, or whether it would make sense to ease the parameters a bit in a follow-up so that it would fill addrman with less addresses.

@fanquake fanquake merged commit 97e2435 into bitcoin:master Oct 8, 2021
@jnewbery jnewbery deleted the 2021-09-addrman-fuzzer-public-methods2 branch October 8, 2021 07:32
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Oct 8, 2021
4445211 [fuzz] Update comment in FillAddrman() (John Newbery)
640476e [fuzz] Make Fill() a free function in fuzz/addrman.cpp (John Newbery)
90ad8ad [fuzz] Make RandAddr() a free function in fuzz/addrman.cpp (John Newbery)
491975c [fuzz] Pass FuzzedDataProvider& into Fill() in addrman fuzz tests (John Newbery)
56303e3 [fuzz] Create a FastRandomContext in addrman fuzz tests (John Newbery)

Pull request description:

  bitcoin#22974 improved the performance of `CAddrMan::Good()` significantly so that it could be used directly in the fuzz tests, instead of those tests reaching directly into addrman's internal data structures.

  This PR continues the work of making the fuzz tests only use `CAddrMan`'s public methods and pulls the `Fill()` and `RandAddr()` methods from `CAddrManDeterministic` into free functions, since they no longer need access to `CAddrMan` internals.

ACKs for top commit:
  theuni:
    utACK 4445211. Agree the failure seems unrelated, looks like some startup race.
  mzumsande:
    ACK 4445211
  vasild:
    ACK 4445211

Tree-SHA512: fcf994e1dedd0012b77f632720b6423d51ceda4eb85c9efe572f2a1150117f9e511114a5206738dd94409137287577f3b01a9998f5237de845410d3d96e7cb7f
@bitcoin bitcoin locked and limited conversation to collaborators Oct 30, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants