Skip to content

Conversation

mzumsande
Copy link
Contributor

@mzumsande mzumsande commented Jan 17, 2023

There have been various reports of corruption of peers.dat recently, see #26599.
As explained in this post in more detail, the underlying issue is likely that we currently serialize AddrMan twice - once for the file stream, once for the hasher that helps create the checksum - and if AddrMan changes in between these two calls, the checksum doesn't match the data and the resulting peers.dat is corrupted.

This PR attempts to fix this by introducing and using HashedSourceWriter - a class that keeps a running hash while serializing data, similar to the existing CHashVerifier which does the analogous thing while unserializing data. Something like this was suggested before, see #10248 (comment).

Fixes #26599 (not by changing the behavior in case of a crash, but by hopefully fixing the underlying cause of these corruptions).

@DrahtBot
Copy link
Contributor

DrahtBot commented Jan 17, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK sipa, naumenkogs
Stale ACK MarcoFalke

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #25284 ([WIP] consensus: Remove dependency on net (BIP 155 / ADDRV2_FORMAT) by MarcoFalke)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@mzumsande
Copy link
Contributor Author

mzumsande commented Jan 17, 2023

I also considered extending CHashVerifier instead of creating a new class, but that didn't work out of the box because of how that class is used in

verifier << pindex->pprev->GetBlockHash();
to write data into the hasher without serializing into the file - I'm not too experienced with our streams, so would be interested in opinions on this.

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

review ACK 49f6fb6

@maflcko
Copy link
Member

maflcko commented Jan 17, 2023

didn't work out of the box because of how that class is used in bitcoin/src/node/blockstorage.cpp

This is removed in #26649, but your approach seems fine, since it targets a backport?

@maflcko maflcko changed the title addrdb: prevent peers.dat corruptions by only serializing once net: prevent peers.dat corruptions by only serializing once Jan 17, 2023
@DrahtBot DrahtBot added the P2P label Jan 17, 2023
@maflcko maflcko added this to the 24.1 milestone Jan 17, 2023
@mzumsande mzumsande force-pushed the 202301_peersdat_corruption branch from 49f6fb6 to 7def755 Compare January 17, 2023 21:16
@maflcko
Copy link
Member

maflcko commented Jan 17, 2023

reutACK 7def755

@mzumsande
Copy link
Contributor Author

mzumsande commented Jan 17, 2023

This is removed in #26649, but your approach seems fine, since it targets a backport?

Yes, I agree it would be nice to backport this, maybe HashedSourceWriter and CHashVerifier could be merged at some later point after #26649.

This class is the counterpart to CHashVerifier, in that it
writes data to an underlying source stream,
while keeping a hash of the written data.
The previous logic would call it once for serializing into the filestream,
and then again for serializing into the hasher. If AddrMan was changed
in between these calls by another thread, the resulting peers.dat would
be corrupt with non-matching checksum and data.
Fix this by using HashedSourceWriter, which writes the data
to the underlying stream and keeps track of the hash in one go.
@mzumsande mzumsande force-pushed the 202301_peersdat_corruption branch from 7def755 to 5eabb61 Compare January 17, 2023 22:22
@sipa
Copy link
Member

sipa commented Jan 17, 2023

I think a possibility is adding a CHashWriter& GetHasher() { return *this; } function to the merged HashedSourceWriter + CHashVerifier class, and then for the blockstorage usage use verifier.GetHasher() << pindex->pprev->GetBlockHash();, as it's data to be fed to just the hash computation rather than the file.

No need for that in this PR though.

@sipa
Copy link
Member

sipa commented Jan 17, 2023

utACK 5eabb61

@naumenkogs
Copy link
Member

utACK 5eabb61

@maflcko maflcko merged commit b5c88a5 into bitcoin:master Jan 19, 2023
@fanquake
Copy link
Member

Added to #26878 for backporting into 24.x.

@fanquake fanquake mentioned this pull request Jan 19, 2023
fanquake pushed a commit to fanquake/bitcoin that referenced this pull request Jan 19, 2023
This class is the counterpart to CHashVerifier, in that it
writes data to an underlying source stream,
while keeping a hash of the written data.

Github-Pull: bitcoin#26909
Rebased-From: da6c7ae
fanquake pushed a commit to fanquake/bitcoin that referenced this pull request Jan 19, 2023
The previous logic would call it once for serializing into the filestream,
and then again for serializing into the hasher. If AddrMan was changed
in between these calls by another thread, the resulting peers.dat would
be corrupt with non-matching checksum and data.
Fix this by using HashedSourceWriter, which writes the data
to the underlying stream and keeps track of the hash in one go.

Github-Pull: bitcoin#26909
Rebased-From: 5eabb61
maflcko pushed a commit to maflcko/bitcoin-core that referenced this pull request Jan 19, 2023
This class is the counterpart to CHashVerifier, in that it
writes data to an underlying source stream,
while keeping a hash of the written data.

Github-Pull: bitcoin#26909
Rebased-From: da6c7ae
maflcko pushed a commit to maflcko/bitcoin-core that referenced this pull request Jan 19, 2023
The previous logic would call it once for serializing into the filestream,
and then again for serializing into the hasher. If AddrMan was changed
in between these calls by another thread, the resulting peers.dat would
be corrupt with non-matching checksum and data.
Fix this by using HashedSourceWriter, which writes the data
to the underlying stream and keeps track of the hash in one go.

Github-Pull: bitcoin#26909
Rebased-From: 5eabb61
fanquake pushed a commit to fanquake/bitcoin that referenced this pull request Jan 19, 2023
This class is the counterpart to CHashVerifier, in that it
writes data to an underlying source stream,
while keeping a hash of the written data.

Github-Pull: bitcoin#26909
Rebased-From: da6c7ae
fanquake pushed a commit to fanquake/bitcoin that referenced this pull request Jan 19, 2023
The previous logic would call it once for serializing into the filestream,
and then again for serializing into the hasher. If AddrMan was changed
in between these calls by another thread, the resulting peers.dat would
be corrupt with non-matching checksum and data.
Fix this by using HashedSourceWriter, which writes the data
to the underlying stream and keeps track of the hash in one go.

Github-Pull: bitcoin#26909
Rebased-From: 5eabb61
@fanquake fanquake mentioned this pull request Jan 19, 2023
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Jan 19, 2023
…lizing once

5eabb61 addrdb: Only call Serialize() once (Martin Zumsande)
da6c7ae hash: add HashedSourceWriter (Martin Zumsande)

Pull request description:

  There have been various reports of corruption of `peers.dat` recently, see bitcoin#26599.
  As explained in [this post](bitcoin#26599 (comment)) in more detail, the underlying issue is likely that we currently serialize `AddrMan` twice - once for the file stream, once for the hasher that helps create the checksum - and if `AddrMan` changes in between these two calls, the checksum doesn't match the data and the resulting `peers.dat` is corrupted.

  This PR attempts to fix this by introducing and using `HashedSourceWriter` - a class that keeps a running hash while serializing data, similar to the existing `CHashVerifier` which does the analogous thing while unserializing data. Something like this was suggested before, see bitcoin#10248 (comment).

  Fixes bitcoin#26599 (not by changing the behavior in case of a crash, but by hopefully fixing the underlying cause of these corruptions).

ACKs for top commit:
  sipa:
    utACK 5eabb61
  naumenkogs:
    utACK 5eabb61

Tree-SHA512: f19ad37c37088d3a9825c60de2efe85cc2b7a21b79b9156024d33439e021443ef977a5f8424a7981bcc13d73d11e30eaa82de60e14d88b3568a67b03e02b153b
@mzumsande mzumsande deleted the 202301_peersdat_corruption branch January 19, 2023 15:44
fanquake added a commit that referenced this pull request Feb 16, 2023
52376d9 depends: fix systemtap download URL (fanquake)
af86266 23.x Add missing includes to fix gcc-13 compile error (fanquake)
3987687 Add missing includes to fix gcc-13 compile error (MarcoFalke)
412cd1a addrdb: Only call Serialize() once (Martin Zumsande)
fd94bef hash: add HashedSourceWriter (Martin Zumsande)

Pull request description:

  Backports:
  * #26909
  * #26924
  * #26944

ACKs for top commit:
  instagibbs:
    ACK 52376d9

Tree-SHA512: fa6463d5086667107b4ce4d87545e0b3f9b7841a52761a4dc6286377f958ecc026ed6694d1cf1e91cbad686309b5d637608f3991c46a20b02421318a804ffcea
fanquake pushed a commit to fanquake/bitcoin that referenced this pull request Feb 22, 2023
This class is the counterpart to CHashVerifier, in that it
writes data to an underlying source stream,
while keeping a hash of the written data.

Github-Pull: bitcoin#26909
Rebased-From: da6c7ae
fanquake pushed a commit to fanquake/bitcoin that referenced this pull request Feb 22, 2023
The previous logic would call it once for serializing into the filestream,
and then again for serializing into the hasher. If AddrMan was changed
in between these calls by another thread, the resulting peers.dat would
be corrupt with non-matching checksum and data.
Fix this by using HashedSourceWriter, which writes the data
to the underlying stream and keeps track of the hash in one go.

Github-Pull: bitcoin#26909
Rebased-From: 5eabb61
glozow added a commit that referenced this pull request Feb 27, 2023
784a754 wallet, rpc: Update migratewallet help text for encrypted wallets (Andrew Chow)
debcfe3 tests: Tests for migrating wallets by name, and providing passphrase (Andrew Chow)
ccc72fe wallet: Be able to unlock the wallet for migration (Andrew Chow)
50dd8b1 rpc: Allow users to specify wallet name for migratewallet (Andrew Chow)
648b062 wallet: Allow MigrateLegacyToDescriptor to take a wallet name (Andrew Chow)
ab3bd45 i2p: use consistent number of tunnels with i2pd and Java I2P (Vasil Dimov)
29cdf42 i2p: lower the number of tunnels for transient sessions (Vasil Dimov)
5027e93 i2p: reuse created I2P sessions if not used (Vasil Dimov)
a62c541 wallet: reuse change dest when recreating TX with avoidpartialspends (Matthew Zipkin)
64e7db6 Zero out wallet master key upon lock (John Moffett)
b7e242e Correctly limit overview transaction list (John Moffett)
cff6718 depends: fix systemtap download URL (fanquake)
7cf73df Add missing includes to fix gcc-13 compile error (MarcoFalke)
07397cd addrdb: Only call Serialize() once (Martin Zumsande)
91f83db hash: add HashedSourceWriter (Martin Zumsande)
5c824ac For feebump, ignore abandoned descendant spends (John Moffett)
428dcd5 wallet: Skip rescanning if wallet is more recent than tip (Andrew Chow)
cbcdafa test: wallet: check that labels are migrated to watchonly wallet (Sebastian Falbesoner)
342abfb wallet: fully migrate address book entries for watchonly/solvable wallets (Sebastian Falbesoner)

Pull request description:

  Backports:
  * #26595
  * #26675
  * #26679
  * #26761
  * #26837
  * #26909
  * #26924
  * #26944
  * bitcoin-core/gui#704
  * #27053
  * #27080

ACKs for top commit:
  instagibbs:
    ACK 784a754
  achow101:
    ACK 784a754
  hebasto:
    ACK 784a754, I've made backporting locally and got a diff between my branch and this PR as follows:

Tree-SHA512: 8ea84aa02d7907ff1e202e1302b441ce9ed2198bf383620ad40056a5d7e8ea88e1047abef0b92d85648016bf9b3195c974be3806ccebd85bef4f85c326869e43
@bitcoin bitcoin locked and limited conversation to collaborators Jan 19, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Do not crash if peers.dat is corrupted
6 participants