-
Notifications
You must be signed in to change notification settings - Fork 37.7k
init: Fix asmap/addrman initialization order bug #22791
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I'm unable to build (with clang 13, CI seems to agree). The first commit compiles; testing. build output
|
e7f17b5
to
4ffc8b8
Compare
Oops. I don't build with bench locally to save a bit of time. Now fixed. |
Test with only the first commit: the addrman checks pass throughout, but I lost 18k addresses on startup, perhaps because I had the -asmap config option commented out to be able to run bitcoind on mainnet, and I uncommented it again to test.
Update: deleted my peers.dat and re-tried building on the PR branch after the update; debug build now succeeds and retesting with multiple restarts initially seems fine.
tested with these config args
|
Further testing seems ok (provided that losing addresses on rebucketing when toggling -asmap on/off is expected)
Restarted after turning off -asmap:
Restarted again with -asmap off:
Restarted with -asmap back on:
Restarted again with -asmap on and reduced frequency of checkaddrman=100, leaving running this way for now:
Addrman checks passing throughout. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 4ffc8b8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK 4ffc8b8 modulo comments below and feedback from fuzzing people wrt invalidating affected corpii
Suggestions:
- link to https://www.erisian.com.au/bitcoin-core-dev/log-2021-08-23.html#l-263 (up to line 382) in the PR description for context
- describe the symptoms/consequences of the issue and how to reproduce
- maybe save refactoring/renaming for a non-bug-fix
- add a passing/failing test that demonstrates the bug is fixed
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Commit 181a120 introduced an initialization order bug: CAddrMan's m_asmap must be set before deserializing peers.dat. Restore that ordering. review hint: use `git diff --color-moved=dimmed-zebra --color-moved-ws=ignore-all-space`
CAddrMan::m_asmap is now set directly in AppInitMain() so CConnMan::SetAsmap() is no longer required.
4ffc8b8
to
8a4c815
Compare
8a4c815
to
d4fda4f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK d4fda4f
This allows us to make it const.
The m_ prefix indicates that a variable is a data member. Using it as a parameter name is misleading. Also update the name of the function from copyStats to CopyStats to comply with our style guide.
Add a GetAsmap() getter function that returns a reference to const.
d4fda4f
to
724c497
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 724c497
Tested ACK 724c497 Code changes look correct to me. |
Addressed these suggestions in #22831 that pulls in the first commit here. That patch could be merged first and this one rebased on it (or vice-versa), or you could optionally pull in the test commits and use the PR description--as you prefer. |
approach ACK, I want to understand the |
code review but utACK 724c497 |
utACK 724c497 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
review ACK 724c497 👫
Show signature and timestamp
Signature:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
review ACK 724c4975622bc22cedc3f3814dfc8e66cf8371f7 👫
-----BEGIN PGP SIGNATURE-----
iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
pUjbBgwAmjfKpl2pPRl+MlWki+w1+qRbTqdMIiBrEu6wt79pkznSY6fNvWXAXCiI
xyqHDMf8psDsZiL3+gPq7qW5cBBPTZb8HxYvJAIqo2J5Gz2aqzNrgB4vegom0NgH
/QTVsdsgIssn2YmMXE6LpWU2nYaMgPqJPKAu4D3GZ2IEDitSLYYNGcm4M9MDYDQ3
FP+QNf+7qTE29UshNXP86iSQ2ITuKN2eXuA+Zm6BpaFmG8lkIsfdjJFQVdDDbCWX
7lB1M/v5mrIK9LylIv/X6iYJi2FMT4UdqICoudp4L+V1m7lElw3Z/ymMjaPuM0NV
9o+sJUbL+/lTFH3yncplmwkEZDdv+M4q00csIw5wAYvgpTRw0aP6YJPL1VzUISsK
R7CStBUNuIlX42hnGwHqUhs4fxwzevfg1tRTJDkk3+Z/y/E3u+Hj+eEBfoDBnU5C
/eqAT4WXAhdVZdqUcxR841Aki67Agsb2bv+pv9sAskG5OTHyynccu3T4Eh4XwsWL
MS67bp/6
=zBMj
-----END PGP SIGNATURE-----
Timestamp of file with hash 9a48951e3c61060be14eb894092e9ac4ae9406fa411fed5ca58fea1d1415078b -
@@ -593,6 +577,8 @@ class CAddrMan | |||
Check(); | |||
} | |||
|
|||
const std::vector<bool>& GetAsmap() const { return m_asmap; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add LIFETIMEBOUND when returning a pointer to memory in this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea.
I actually plan to fully encapsulate asmap in its own component in #22910, so we're not passing out references to private data.
Why was this comment, made 12 days ago and 8 days before merge, ignored? It would be one thing if this change was a straightforward revert or a simple reordering, but at face value it is not obvious that this change fixes the fairly severe bug that @jonatack discovered. It is troubling that a patch of this complexity was merged without an accompanying test verifying its efficacy, especially when one was made readily available. |
@jamesob I didn't understand either. The suggestions in #22791 (review) were also essentially ignored. |
There were 5 ACKs, including one tested ACK, before I added the 6th ACK. I think this number of review ACKs (in addition to the tests to verify the fix by people who didn't leave a comment) is sufficient to merge a pull request. This is overall a higher than average review interest compared to other pull reqeusts. If the alternative pull doesn't get any ACKs, it would appear slightly odd to me to merge it instead. |
Please see #22831 (comment). |
36f814c [netgroupman] Remove NetGroupManager::GetAsmap() (John Newbery) 4709fc2 [netgroupman] Move asmap checksum calculation to NetGroupManager (John Newbery) 1b978a7 [netgroupman] Move GetMappedAS() and GetGroup() logic to NetGroupManager (John Newbery) ddb4101 [net] Only use public CNetAddr functions and data in GetMappedAS() and GetGroup() (John Newbery) 6b22681 [netgroupman] Add GetMappedAS() and GetGroup() (John Newbery) 1943156 [net] Move asmap into NetGroupManager (John Newbery) 17c24d4 [init] Add netgroupman to node.context (John Newbery) 9b38367 [build] Add netgroup.cpp|h (John Newbery) Pull request description: The asmap data is currently owned by addrman, but is used by both addrman and connman. #22791 made the data const and private (so that it can't be updated by other components), but it is still passed out of addrman as a reference to const, and used by `CNetAddress` to calculate the group and AS of the net address. This RFC PR proposes to move all asmap data and logic into a new `NetGroupManager` component. This is initialized at startup, and the client components addrman and connman simply call `NetGroupManager::GetGroup(const CAddress&)` and `NetGroupManager::GetMappedAS(const CAddress&)` to get the net group and AS of an address. ACKs for top commit: mzumsande: Code Review ACK 36f814c jnewbery: CI failure seems spurious. I rebased onto latest master to trigger a new CI run, but whilst I was doing that, mzumsande ACKed 36f814c, so I've reverted to that. dergoegge: Code review ACK 36f814c Tree-SHA512: 244a89cdfd720d8cce679eae5b7951e1b46b37835fccb6bdfa362856761bb110e79e263a6eeee8246140890f3bee2850e9baa7bc14a388a588e0e29b9d275175
Summary: ``` Commit 181a1207 introduced an initialization order bug: CAddrMan's m_asmap must be set before deserializing peers.dat. Restore that ordering. ``` Note that we didn't introduce the bug (yet), mostly because most of the referenced PR is not relevant for us so it has been purposedly delayed. A note is added to the code so that later backport will not introduce more issues. Partial backport of [[bitcoin/bitcoin#22791 | core#22791]]: bitcoin/bitcoin@50fd770 Test Plan: ninja all check-all Reviewers: #bitcoin_abc, PiRK Reviewed By: #bitcoin_abc, PiRK Differential Revision: https://reviews.bitcoinabc.org/D12291
Summary: ``` CAddrMan::m_asmap is now set directly in AppInitMain() so CConnMan::SetAsmap() is no longer required. ``` Partial backport of [[bitcoin/bitcoin#22791 | core#22791]]: bitcoin/bitcoin@5932478 Depends on D12291. Test Plan: ninja all check Reviewers: #bitcoin_abc, PiRK Reviewed By: #bitcoin_abc, PiRK Differential Revision: https://reviews.bitcoinabc.org/D12292
Summary: ``` This allows us to make it const. ``` Partial backport of [[bitcoin/bitcoin#22791 | core#22791]]: bitcoin/bitcoin@f572f2b#diff-b1e19192258d83199d8adaa5ac31f067af98f63554bfdd679bd8e8073815e69d Depends on D12292. Test Plan: ninja all check Reviewers: #bitcoin_abc, PiRK Reviewed By: #bitcoin_abc, PiRK Differential Revision: https://reviews.bitcoinabc.org/D12293
Summary: ``` Add a GetAsmap() getter function that returns a reference to const. ``` Partial backport of [[bitcoin/bitcoin#22791 | core#22791]]: bitcoin/bitcoin@5840476 Depends on D12293. Test Plan: ninja all check Reviewers: #bitcoin_abc, sdulfari Reviewed By: #bitcoin_abc, sdulfari Differential Revision: https://reviews.bitcoinabc.org/D12294
Commit 181a120 introduced an initialization order bug: CAddrMan's m_asmap must be set before deserializing peers.dat.
The first commit restores the correct initialization order. The remaining commits make
CAddrMan::m_asmap
usage safer:CAddrMan
's internal data fromCConnMan
m_asmap
in the initializer list and make it constm_asmap
private, and access it (as a reference to const) from a getter.This ensures that peers.dat deserialization must happen after setting m_asmap, since m_asmap is set during CAddrMan construction.