-
Notifications
You must be signed in to change notification settings - Fork 37.7k
Raise InitError when peers.dat is invalid or corrupted #22762
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
fad1965
to
fa50c25
Compare
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Concept ACK |
fa50c25
to
fa1704a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
force pushed to address review comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK fa1704a
fa6c460
to
faff4fd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK faff4fd
Concept ACK. Like other open pulls to change or fix addrman behavior, it might be good to have functional/regression test coverage in place first. |
faff4fd
to
b4cc1c5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK fa7003f
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK fa7003f review / debug build clean on each commit, verified the test coverage of #22831 passes on this branch with the following changes
--- a/test/functional/feature_addrman_asmap.py
+++ b/test/functional/feature_addrman_asmap.py
@@ -106,17 +106,21 @@ class AddrmanAsmapTest(BitcoinTestFramework):
self.log.info('Test bitcoind with missing peers.dat and addrman checks')
self.stop_node(0)
os.remove(os.path.join(self.datadir, 'peers.dat'))
- with self.node.assert_debug_log([f'Missing or invalid file {self.datadir}/peers.dat', 'Recreating peers.dat']):
+ msg = f'Creating peers.dat because the file was not found ("{self.datadir}/peers.dat")'
+ with self.node.assert_debug_log(expected_msgs=[msg]):
self.start_node(0, ['-checkaddrman=1'])
def test_peers_dat_with_invalid_network_magic(self):
self.log.info('Test bitcoind with peers.dat having invalid network magic and addrman checks')
self.stop_node(0)
shutil.copyfile(self.asmap_raw, os.path.join(self.datadir, 'peers.dat'))
- with self.node.assert_debug_log(['Invalid network magic number', 'Recreating peers.dat']):
- self.start_node(0, ['-checkaddrman=1'])
- self.node.getnodeaddresses()
+ msg = (
+ 'Error: Invalid or corrupt peers.dat (Invalid network magic number). If you believe this is a bug, please '
+ 'report it to https://github.com/bitcoin/bitcoin/issues. As a workaround, you can move the file '
+ f'("{self.datadir}/peers.dat") out of the way (rename, move, or delete) to have a new one created on the next start.'
+ )
+ self.node.assert_start_raises_init_error(extra_args=['-checkaddrman=1'], expected_msg=msg)
Tested on Pop!_OS and everything works as expected and mentioned in PR description. Master Branch:peers.dat is replaced with a new file if it is corrupt or other issues. PR Branch:A. Corrupt peers.dat file
B. peers.dat from testnet used in regtest
Fix the issue by following the things mentioned in error: Delete peers.dat and restart |
fade9a1 Remove confusing CAddrDB (MarcoFalke) fa7f77b Fix addrdb includes (MarcoFalke) fa3f5d0 Move addrman includes from .h to .cpp (MarcoFalke) Pull request description: Split out from bitcoin#22762 to avoid having to carry it around in (an)other rebase(s) ACKs for top commit: ryanofsky: Code review ACK fade9a1 lsilva01: Code Review ACK bitcoin@fade9a1 Tree-SHA512: 7615fb0b6235d0c1e6f8cd6263dd18c4d95890567c2b797fe6fce6cb12cc85ce6eacbe07dbb6d81b05d179ef03b42edfd61c940e35a1044ce6d363b54c2dae5c
Init should only concern itself with the initialization order, not the detailed initialization logic of every module. Also, inlining logic into a method that is ~800 lines of code, makes it impossible to unit test on its own.
No need to have a function that is only called in one place
fa7003f
to
fa59c6d
Compare
fa59c6d
to
fa55c3d
Compare
Rebased (only change is in tests) |
Code review re-ACK fa55c3d per |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK fa55c3d
tACK fa55c3d nit: Will be better if error message was printed only once #22762 (comment) |
InitErrors are printed to the debug log and stderr. I think it makes sense to print them to both instead of only one. |
DeserializeFileDB(path_addr, *addrman, CLIENT_VERSION); | ||
LogPrintf("Loaded %i addresses from peers.dat %dms\n", addrman->size(), GetTimeMillis() - nStart); | ||
} catch (const DbNotFoundError&) { | ||
// Addrman can be in an inconsistent state after failure, reset it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify that, strictly, this isn't needed. DbNotFoundError
means that the file wasn't found, so addrman shouldn't be touched at all.
Does anyone have opinions on whether to remove it or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we need to distinguish somehow this from the other error case, so that we don't signal an error to the caller in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant recreating the addrman can be skipped, because it wasn't touched. Saves two lines of code:
diff --git a/src/addrdb.cpp b/src/addrdb.cpp
index 1e73750ce3..a86f4c3789 100644
--- a/src/addrdb.cpp
+++ b/src/addrdb.cpp
@@ -192,8 +192,6 @@ std::optional<bilingual_str> LoadAddrman(const std::vector<bool>& asmap, const A
DeserializeFileDB(path_addr, *addrman, CLIENT_VERSION);
LogPrintf("Loaded %i addresses from peers.dat %dms\n", addrman->size(), GetTimeMillis() - nStart);
} catch (const DbNotFoundError&) {
- // Addrman can be in an inconsistent state after failure, reset it
- addrman = std::make_unique<CAddrMan>(asmap, /* deterministic */ false, /* consistency_check_ratio */ check_addrman);
LogPrintf("Creating peers.dat because the file was not found (%s)\n", path_addr);
DumpPeerAddresses(args, *addrman);
} catch (const std::exception& e) {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would leave the unique_ptr empty and not signal error to the caller. So the caller will continue, possibly dereferencing the empty unique_ptr, leading to undefined behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pointer is initialized in the beginning of the function. Otherwise DumpPeerAddresses(args, *addrman);
wouldn't work either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removal LGTM. Built with the suggested diff and tested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pointer is initialized in the beginning of the function
ah, well, then it is ok. ACK.
I'm not sure it's entirely desirable to require user intervention when downgrading. When there is actual corruption, this may make sense as it's a sign of a deeper problem, but wiping peers.dat when downgrading to an incompatible version seems entirely expected to me. |
…ers.dat d41ed32 p2p: Avoid InitError when downgrading peers.dat (junderw) Pull request description: fixes #24188 (also see bitcoin/bitcoin#22762 (comment)) When downgrading, a peers.dat with a future version that has a minimum required version larger than the downgraded Bitcoin Core version would cause an InitError. This commit changes this behavior to overwrite the existing peers.dat with a new empty one. ACKs for top commit: prayank23: reACK bitcoin/bitcoin@d41ed32 kallewoof: reACK d41ed32 Tree-SHA512: c8e625fe36ce0b1aab6c8ef7241c8954038bb856f2de27bdc4814dc9a60e51be28815c7d77d0f96eace49687a0cea02deb713978bbd3a5add742f50a675f2a40
d41ed32 p2p: Avoid InitError when downgrading peers.dat (junderw) Pull request description: fixes bitcoin#24188 (also see bitcoin#22762 (comment)) When downgrading, a peers.dat with a future version that has a minimum required version larger than the downgraded Bitcoin Core version would cause an InitError. This commit changes this behavior to overwrite the existing peers.dat with a new empty one. ACKs for top commit: prayank23: reACK bitcoin@d41ed32 kallewoof: reACK d41ed32 Tree-SHA512: c8e625fe36ce0b1aab6c8ef7241c8954038bb856f2de27bdc4814dc9a60e51be28815c7d77d0f96eace49687a0cea02deb713978bbd3a5add742f50a675f2a40
Summary: ``` peers.dat is silently erased when it can not be parsed or when it appears corrupted. Fix that by notifying the user. This might help in the following examples: - The user provided the database, but picked the wrong one. - A future version of Bitcoin Core wrote the file and it can't be read. - The file was corrupted by a logic bug in Bitcoin Core. - The file was corrupted by a disk failure. ``` Backport of [[bitcoin/bitcoin#22762 | core#22762]]. Also cherry-picks a commit from PR22697 that is now required for the addrman_tests to pass, and for which the rationale is still trivially correct: bitcoin/bitcoin@ed9ba8a Test Plan: ninja all check-all Reviewers: #bitcoin_abc, PiRK Reviewed By: #bitcoin_abc, PiRK Differential Revision: https://reviews.bitcoinabc.org/D12308
peers.dat is silently erased when it can not be parsed or when it appears corrupted. Fix that by notifying the user. This might help in the following examples: