Skip to content

Conversation

jnewbery
Copy link
Contributor

Commit 181a120 introduced an initialization order bug: CAddrMan's m_asmap must be set before deserializing peers.dat.

The first commit restores the correct initialization order. The remaining commits make CAddrMan::m_asmap usage safer:

  • don't reach into CAddrMan's internal data from CConnMan
  • set m_asmap in the initializer list and make it const
  • make m_asmap private, and access it (as a reference to const) from a getter.

This ensures that peers.dat deserialization must happen after setting m_asmap, since m_asmap is set during CAddrMan construction.

@fanquake fanquake added the P2P label Aug 24, 2021
@jnewbery jnewbery changed the title [init] Read/decode asmap before constructing addrman init: Fix asmap/addrman initialization order bug Aug 24, 2021
@jonatack
Copy link
Member

jonatack commented Aug 24, 2021

I'm unable to build (with clang 13, CI seems to agree). The first commit compiles; testing.

build output

bench/addrman.cpp:76:18: error: no matching constructor for initialization of 'CAddrMan'
        CAddrMan addrman{/* deterministic */ false, /* consistency_check_ratio */ 0};
                 ^      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./addrman.h:181:7: note: candidate constructor (the implicit copy constructor) not viable: requires 1 argument, but 2 were provided
class CAddrMan
      ^
./addrman.h:458:14: note: candidate constructor not viable: requires 3 arguments, but 2 were provided
    explicit CAddrMan(std::vector<bool>* asmap, bool deterministic, int32_t consistency_check_ratio);
             ^
bench/addrman.cpp:83:14: error: no matching constructor for initialization of 'CAddrMan'
    CAddrMan addrman(/* deterministic */ false, /* consistency_check_ratio */ 0);
             ^                           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./addrman.h:181:7: note: candidate constructor (the implicit copy constructor) not viable: requires 1 argument, but 2 were provided
class CAddrMan
      ^
./addrman.h:458:14: note: candidate constructor not viable: requires 3 arguments, but 2 were provided
    explicit CAddrMan(std::vector<bool>* asmap, bool deterministic, int32_t consistency_check_ratio);
             ^
bench/addrman.cpp:95:14: error: no matching constructor for initialization of 'CAddrMan'
    CAddrMan addrman(/* deterministic */ false, /* consistency_check_ratio */ 0);
             ^                           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./addrman.h:181:7: note: candidate constructor (the implicit copy constructor) not viable: requires 1 argument, but 2 were provided
class CAddrMan
      ^
./addrman.h:458:14: note: candidate constructor not viable: requires 3 arguments, but 2 were provided
    explicit CAddrMan(std::vector<bool>* asmap, bool deterministic, int32_t consistency_check_ratio);
             ^

@jnewbery jnewbery force-pushed the 2021-08-asmap-addrman-init branch from e7f17b5 to 4ffc8b8 Compare August 24, 2021 13:42
@jnewbery
Copy link
Contributor Author

I'm unable to build (with clang 13, CI seems to agree)

Oops. I don't build with bench locally to save a bit of time. Now fixed.

@jonatack
Copy link
Member

jonatack commented Aug 24, 2021

Test with only the first commit: the addrman checks pass throughout, but I lost 18k addresses on startup, perhaps because I had the -asmap config option commented out to be able to run bitcoind on mainnet, and I uncommented it again to test.

2021-08-24T13:39:14Z [init] Opened asmap file "/home/jon/projects/bitcoin/asmap/demo.map" (932999 bytes) from disk
2021-08-24T13:39:15Z [init] Using asmap version 7ead927689f884bc4021618017cc1438d9328578779edd2adfbed044fb52928e for IP bucketing
2021-08-24T13:39:15Z [init] init message: Loading P2P addresses…
2021-08-24T13:39:16Z [init] Bucketing method was updated, re-bucketing addrman entries from disk
2021-08-24T13:39:25Z [init] addrman lost 18599 new and 1 tried addresses due to collisions or invalid addresses
2021-08-24T13:39:25Z [init] Addrman checks started: new 26373, tried 72, total 26445
2021-08-24T13:39:25Z [init] Addrman checks completed successfully
2021-08-24T13:39:25Z [init] Loaded 26445 addresses from peers.dat  9577ms

Update: deleted my peers.dat and re-tried building on the PR branch after the update; debug build now succeeds and retesting with multiple restarts initially seems fine.

2021-08-24T14:11:09Z [init] Opened asmap file "/home/jon/projects/bitcoin/asmap/demo.map" (932999 bytes) from disk
2021-08-24T14:11:11Z [init] Using asmap version 7ead927689f884bc4021618017cc1438d9328578779edd2adfbed044fb52928e for IP bucketing
2021-08-24T14:11:11Z [init] init message: Loading P2P addresses…
2021-08-24T14:11:11Z [init] Addrman checks started: new 6828, tried 10, total 6838
2021-08-24T14:11:12Z [init] Addrman checks completed successfully
2021-08-24T14:11:12Z [init] Loaded 6838 addresses from peers.dat  756ms
tested with these config args

2021-08-24T13:41:49Z [init] Config file arg: blockfilterindex="1"
2021-08-24T13:41:49Z [init] Config file arg: checkaddrman="1"
2021-08-24T13:41:49Z [init] Config file arg: checkblocks="12"
2021-08-24T13:41:49Z [init] Config file arg: checklevel="4"
2021-08-24T13:41:49Z [init] Config file arg: checkmempool="1"
2021-08-24T13:41:49Z [init] Config file arg: coinstatsindex="1"
2021-08-24T13:41:49Z [init] Config file arg: debug="tor"
2021-08-24T13:41:49Z [init] Config file arg: debug="i2p"
2021-08-24T13:41:49Z [init] Config file arg: debug="addrman"
2021-08-24T13:41:49Z [init] Config file arg: i2psam="127.0.0.1:7656"
2021-08-24T13:41:49Z [init] Config file arg: logips="1"
2021-08-24T13:41:49Z [init] Config file arg: logthreadnames="1"
2021-08-24T13:41:49Z [init] Config file arg: par="1"
2021-08-24T13:41:49Z [init] Config file arg: peerbloomfilters="1"
2021-08-24T13:41:49Z [init] Config file arg: txindex="1"
2021-08-24T13:41:49Z [init] Config file arg: walletbroadcast="1"
2021-08-24T13:41:49Z [init] Config file arg: [main] asmap="/home/jon/projects/bitcoin/asmap/demo.map"

@jonatack
Copy link
Member

jonatack commented Aug 24, 2021

Further testing seems ok (provided that losing addresses on rebucketing when toggling -asmap on/off is expected)

Restarted after turning off -asmap:

2021-08-24T14:19:00Z [init] Using /16 prefix for IP bucketing
2021-08-24T14:19:00Z [init] init message: Loading P2P addresses…
2021-08-24T14:19:01Z [init] Bucketing method was updated, re-bucketing addrman entries from disk
2021-08-24T14:19:01Z [init] addrman lost 2147 new and 0 tried addresses due to collisions or invalid addresses
2021-08-24T14:19:01Z [init] Loaded 9928 addresses from peers.dat  463ms

Restarted again with -asmap off:

2021-08-24T14:20:59Z [init] Using /16 prefix for IP bucketing
2021-08-24T14:20:59Z [init] init message: Loading P2P addresses…
2021-08-24T14:21:00Z [init] Loaded 10760 addresses from peers.dat  169ms

Restarted with -asmap back on:

2021-08-24T14:22:44Z [init] Opened asmap file "/home/jon/projects/bitcoin/asmap/demo.map" (932999 bytes) from disk
2021-08-24T14:22:46Z [init] Using asmap version 7ead927689f884bc4021618017cc1438d9328578779edd2adfbed044fb52928e for IP bucketing
2021-08-24T14:22:46Z [init] init message: Loading P2P addresses…
2021-08-24T14:22:46Z [init] Bucketing method was updated, re-bucketing addrman entries from disk
2021-08-24T14:22:48Z [init] addrman lost 510 new and 0 tried addresses due to collisions or invalid addresses
2021-08-24T14:22:48Z [init] Loaded 10462 addresses from peers.dat  2802ms

Restarted again with -asmap on and reduced frequency of checkaddrman=100, leaving running this way for now:

2021-08-24T14:24:41Z [init] Opened asmap file "/home/jon/projects/bitcoin/asmap/demo.map" (932999 bytes) from disk
2021-08-24T14:24:43Z [init] Using asmap version 7ead927689f884bc4021618017cc1438d9328578779edd2adfbed044fb52928e for IP bucketing
2021-08-24T14:24:43Z [init] init message: Loading P2P addresses…
2021-08-24T14:24:43Z [init] Loaded 11203 addresses from peers.dat  737ms
...
2021-08-24T14:30:12Z [msghand] Addrman checks started: new 15479, tried 19, total 15498
2021-08-24T14:30:13Z [msghand] Addrman checks completed successfully

Addrman checks passing throughout.

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 4ffc8b8

Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK 4ffc8b8 modulo comments below and feedback from fuzzing people wrt invalidating affected corpii

Suggestions:

@DrahtBot
Copy link
Contributor

DrahtBot commented Aug 25, 2021

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #22831 (p2p, bugfix: fix addrman tried table corruption on restart with asmap by jonatack)
  • #22778 (net processing: Reduce resource usage for inbound block-relay-only connections by jnewbery)
  • #22766 (refactor: Clarify and disable unused ArgsManager flags by ryanofsky)
  • #22762 (Raise InitError when peers.dat is invalid or corrupted by MarcoFalke)
  • #21160 (net/net processing: Move tx inventory into net_processing by jnewbery)
  • #19860 (Improve diversification of new connections: privacy and stability by naumenkogs)
  • #16545 (refactor: Implement missing error checking for ArgsManager flags by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Commit 181a120 introduced an initialization order bug: CAddrMan's
m_asmap must be set before deserializing peers.dat. Restore that
ordering.

review hint: use

`git diff --color-moved=dimmed-zebra --color-moved-ws=ignore-all-space`
CAddrMan::m_asmap is now set directly in AppInitMain() so
CConnMan::SetAsmap() is no longer required.
@jnewbery
Copy link
Contributor Author

Thanks for the reviews @vasild @jonatack. I've addressed all your inline comments.

@jonatack - If you have any suggested wording, I'm happy to incorporate it into the PR description.

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK d4fda4f

This allows us to make it const.
The m_ prefix indicates that a variable is a data member. Using it as
a parameter name is misleading.

Also update the name of the function from copyStats to CopyStats to
comply with our style guide.
Add a GetAsmap() getter function that returns a reference to const.
Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 724c497

@mzumsande
Copy link
Contributor

Tested ACK 724c497

Code changes look correct to me.
I verified that current master failed the addrman checks (-checkaddrman) when switching from no-asmap mode to asmap mode with an existing peers.dat, while I experienced no problems over multiple switches on this branch.

@jonatack
Copy link
Member

Suggestions:

* link to [erisian.com.au/bitcoin-core-dev/log-2021-08-23.html#l-263](https://www.erisian.com.au/bitcoin-core-dev/log-2021-08-23.html#l-263) (up to line 382) in the PR description for context

* describe the symptoms/consequences of the issue and how to reproduce

* maybe save refactoring/renaming for a non-bug-fix

* add a passing/failing test that demonstrates the bug is fixed

Addressed these suggestions in #22831 that pulls in the first commit here. That patch could be merged first and this one rebased on it (or vice-versa), or you could optionally pull in the test commits and use the PR description--as you prefer.

@amitiuttarwar
Copy link
Contributor

approach ACK, I want to understand the std::move implications & then I'm ready to leave a review ACK.
did you consider making m_asmap an optional arg for the addrman constructor?
but overall, I like these changes- in addition to fixing the bug, making asmap a private & read-only member of addrman seems like an incremental improvement to me.

@amitiuttarwar
Copy link
Contributor

code review but utACK 724c497

@naumenkogs
Copy link
Member

utACK 724c497

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

review ACK 724c497 👫

Show signature and timestamp

Signature:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

review ACK 724c4975622bc22cedc3f3814dfc8e66cf8371f7 👫
-----BEGIN PGP SIGNATURE-----

iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
pUjbBgwAmjfKpl2pPRl+MlWki+w1+qRbTqdMIiBrEu6wt79pkznSY6fNvWXAXCiI
xyqHDMf8psDsZiL3+gPq7qW5cBBPTZb8HxYvJAIqo2J5Gz2aqzNrgB4vegom0NgH
/QTVsdsgIssn2YmMXE6LpWU2nYaMgPqJPKAu4D3GZ2IEDitSLYYNGcm4M9MDYDQ3
FP+QNf+7qTE29UshNXP86iSQ2ITuKN2eXuA+Zm6BpaFmG8lkIsfdjJFQVdDDbCWX
7lB1M/v5mrIK9LylIv/X6iYJi2FMT4UdqICoudp4L+V1m7lElw3Z/ymMjaPuM0NV
9o+sJUbL+/lTFH3yncplmwkEZDdv+M4q00csIw5wAYvgpTRw0aP6YJPL1VzUISsK
R7CStBUNuIlX42hnGwHqUhs4fxwzevfg1tRTJDkk3+Z/y/E3u+Hj+eEBfoDBnU5C
/eqAT4WXAhdVZdqUcxR841Aki67Agsb2bv+pv9sAskG5OTHyynccu3T4Eh4XwsWL
MS67bp/6
=zBMj
-----END PGP SIGNATURE-----

Timestamp of file with hash 9a48951e3c61060be14eb894092e9ac4ae9406fa411fed5ca58fea1d1415078b -

@@ -593,6 +577,8 @@ class CAddrMan
Check();
}

const std::vector<bool>& GetAsmap() const { return m_asmap; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add LIFETIMEBOUND when returning a pointer to memory in this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea.

I actually plan to fully encapsulate asmap in its own component in #22910, so we're not passing out references to private data.

@maflcko maflcko merged commit a8fdfea into bitcoin:master Sep 6, 2021
@jnewbery jnewbery deleted the 2021-08-asmap-addrman-init branch September 7, 2021 09:06
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Sep 7, 2021
@jamesob
Copy link
Contributor

jamesob commented Sep 10, 2021

Addressed these suggestions in #22831 that pulls in the first commit here. That patch could be merged first and this one rebased on it (or vice-versa), or you could optionally pull in the test commits and use the PR description--as you prefer.

Why was this comment, made 12 days ago and 8 days before merge, ignored?

It would be one thing if this change was a straightforward revert or a simple reordering, but at face value it is not obvious that this change fixes the fairly severe bug that @jonatack discovered. It is troubling that a patch of this complexity was merged without an accompanying test verifying its efficacy, especially when one was made readily available.

@jonatack
Copy link
Member

Concept ACK 4ffc8b8 modulo comments below and feedback from fuzzing people wrt invalidating affected corpii

Suggestions:

* link to [erisian.com.au/bitcoin-core-dev/log-2021-08-23.html#l-263](https://www.erisian.com.au/bitcoin-core-dev/log-2021-08-23.html#l-263) (up to line 382) in the PR description for context

* describe the symptoms/consequences of the issue and how to reproduce

* maybe save refactoring/renaming for a non-bug-fix

* add a passing/failing test that demonstrates the bug is fixed

@jamesob I didn't understand either. The suggestions in #22791 (review) were also essentially ignored.

@maflcko
Copy link
Member

maflcko commented Sep 11, 2021

There were 5 ACKs, including one tested ACK, before I added the 6th ACK. I think this number of review ACKs (in addition to the tests to verify the fix by people who didn't leave a comment) is sufficient to merge a pull request. This is overall a higher than average review interest compared to other pull reqeusts. If the alternative pull doesn't get any ACKs, it would appear slightly odd to me to merge it instead.

@jnewbery
Copy link
Contributor Author

Why was this comment, made 12 days ago and 8 days before merge, ignored?

It would be one thing if this change was a straightforward revert or a simple reordering, but at face value it is not obvious that this change fixes the fairly severe bug that @jonatack discovered. It is troubling that a patch of this complexity was merged without an accompanying test verifying its efficacy, especially when one was made readily available.

@jamesob (#22791 (comment))

The suggestions in #22791 (review) were also essentially ignored.

@jonatack (#22791 (comment))

Please see #22831 (comment).

fanquake added a commit that referenced this pull request Apr 22, 2022
36f814c [netgroupman] Remove NetGroupManager::GetAsmap() (John Newbery)
4709fc2 [netgroupman] Move asmap checksum calculation to NetGroupManager (John Newbery)
1b978a7 [netgroupman] Move GetMappedAS() and GetGroup() logic to NetGroupManager (John Newbery)
ddb4101 [net] Only use public CNetAddr functions and data in GetMappedAS() and GetGroup() (John Newbery)
6b22681 [netgroupman] Add GetMappedAS() and GetGroup() (John Newbery)
1943156 [net] Move asmap into NetGroupManager (John Newbery)
17c24d4 [init] Add netgroupman to node.context (John Newbery)
9b38367 [build] Add netgroup.cpp|h (John Newbery)

Pull request description:

  The asmap data is currently owned by addrman, but is used by both addrman and connman. #22791 made the data const and private (so that it can't be updated by other components), but it is still passed out of addrman as a reference to const, and used by `CNetAddress` to calculate the group and AS of the net address.

  This RFC PR proposes to move all asmap data and logic into a new `NetGroupManager` component. This is initialized at startup, and the client components addrman and connman simply call `NetGroupManager::GetGroup(const CAddress&)` and `NetGroupManager::GetMappedAS(const CAddress&)` to get the net group and AS of an address.

ACKs for top commit:
  mzumsande:
    Code Review ACK 36f814c
  jnewbery:
    CI failure seems spurious. I rebased onto latest master to trigger a new CI run, but whilst I was doing that, mzumsande ACKed 36f814c, so I've reverted to that.
  dergoegge:
    Code review ACK 36f814c

Tree-SHA512: 244a89cdfd720d8cce679eae5b7951e1b46b37835fccb6bdfa362856761bb110e79e263a6eeee8246140890f3bee2850e9baa7bc14a388a588e0e29b9d275175
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Oct 18, 2022
Summary:
```
Commit 181a1207 introduced an initialization order bug: CAddrMan's m_asmap must be set before deserializing peers.dat. Restore that ordering.
```

Note that we didn't introduce the bug (yet), mostly because most of the referenced PR is not relevant for us so it has been purposedly delayed. A note is added to the code so that later backport will not introduce more issues.

Partial backport of [[bitcoin/bitcoin#22791 | core#22791]]:
bitcoin/bitcoin@50fd770

Test Plan:
  ninja all check-all

Reviewers: #bitcoin_abc, PiRK

Reviewed By: #bitcoin_abc, PiRK

Differential Revision: https://reviews.bitcoinabc.org/D12291
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Oct 18, 2022
Summary:
```
CAddrMan::m_asmap is now set directly in AppInitMain() so
CConnMan::SetAsmap() is no longer required.
```

Partial backport of [[bitcoin/bitcoin#22791 | core#22791]]:
bitcoin/bitcoin@5932478

Depends on D12291.

Test Plan:
  ninja all check

Reviewers: #bitcoin_abc, PiRK

Reviewed By: #bitcoin_abc, PiRK

Differential Revision: https://reviews.bitcoinabc.org/D12292
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Oct 18, 2022
Summary:
```
This allows us to make it const.
```

Partial backport of [[bitcoin/bitcoin#22791 | core#22791]]:
bitcoin/bitcoin@f572f2b#diff-b1e19192258d83199d8adaa5ac31f067af98f63554bfdd679bd8e8073815e69d

Depends on D12292.

Test Plan:
  ninja all check

Reviewers: #bitcoin_abc, PiRK

Reviewed By: #bitcoin_abc, PiRK

Differential Revision: https://reviews.bitcoinabc.org/D12293
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Oct 18, 2022
Summary:
```
Add a GetAsmap() getter function that returns a reference to const.
```

Partial backport of [[bitcoin/bitcoin#22791 | core#22791]]:
bitcoin/bitcoin@5840476

Depends on D12293.

Test Plan:
  ninja all check

Reviewers: #bitcoin_abc, sdulfari

Reviewed By: #bitcoin_abc, sdulfari

Differential Revision: https://reviews.bitcoinabc.org/D12294
@bitcoin bitcoin locked and limited conversation to collaborators Oct 30, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.