Skip to content

Conversation

maflcko
Copy link
Member

@maflcko maflcko commented Aug 18, 2021

Assert should only be used for program internal logic errors, not to sanitize external user input.

The assert was introduced via the debug-only runtime option -checkaddrman in commit 803ef70, thus won't need a backport.

Also, it doesn't really make sense to continue when the deserialized addrman doesn't pass the sanity check.

For example, if nLastSuccess is negative, it would later result in integer overflows. Thus, this patch fixes #22931.

Also,
Fixes #22503
Fixes #22504
Fixes #22519

Closes #22498

Steps to test:

mkdir -p /tmp/test_235/regtest/
echo 'H4sIAAAAAAAAA/u1f+stZmUGYgELgwPRakfBKBgFo2AUjIJRMApGwSgYBaNgFIyCUTBswdyGpFnLjUKjP9e0bvjYusl6b+L2e7Vs2dd6N//Pua0/xQUALJAn93IQAAA=' | base64 --decode | zcat > /tmp/test_235/regtest/peers.dat
./src/qt/bitcoin-qt -regtest -datadir=/tmp/test_235/ -checkaddrman=1 -printtoconsole | grep -A2 'Loading P2P addresses'

Output before:

2021-09-10T11:28:37Z init message: Loading P2P addresses…
2021-09-10T11:28:37Z ADDRMAN CONSISTENCY CHECK FAILED!!! err=-16
bitcoin-qt: addrman.cpp:765: void CAddrMan::Check() const: Assertion `false' failed.

(program crashes)

Output after:

2021-09-10T11:26:00Z init message: Loading P2P addresses…
2021-09-10T11:26:00Z Error: Invalid or corrupt peers.dat (Corrupt data. Consistency check failed with code -16: iostream error). If you believe this is a bug, please report it to https://github.com/bitcoin/bitcoin/issues. As a workaround, you can move the file ("/tmp/test_235/regtest/peers.dat") out of the way (rename, move, or delete) to have a new one created on the next start.

(program exits)

@mzumsande
Copy link
Contributor

I agree that corrupt data shouldn't assert.

Check() used to be restricted to tests / use of the checkaddrman option, it was never run per default. This PR makes it run always for Unserialize(), so it introduces stricter rules for the level of inconsistencies we tolerate from peers.dat.
I think this change of default behavior should be mentioned in the OP and maybe somewhere in the code, because it also introduces new possibilities of p2p addrman poisoning to make us drop our peers.dat - so we should be confident all checks in Check() are correct before enabling this.

@maflcko maflcko changed the title addrman: Avoid crash on corrupt data addrman: Avoid crash on corrupt data, Run Check after deserialize Aug 18, 2021
@maflcko maflcko changed the title addrman: Avoid crash on corrupt data, Run Check after deserialize addrman: Avoid crash on corrupt data, Force Check after deserialize Aug 18, 2021
@maflcko
Copy link
Member Author

maflcko commented Aug 18, 2021

Thanks, renamed PR title

@mzumsande
Copy link
Contributor

Thanks! I think there is a bug in the addman_serdeser fuzz test (see CI fail) that would need fixing, see mzumsande@f484061

@jnewbery
Copy link
Contributor

With this change, if we have a bug that causes us to write a peers.dat file that's corrupt in any way, then it would mostly be undetected - the addrman would be cleared and startup would continue as normal. As @mzumsande points out, if it's possible to trigger such a bug over p2p, then an adversary could cause victims to lose their entire addrman on restart.

It might be safer to do the ForceCheck, but then catch the exception, rename peers.dat to peers.bak, and terminate with some error message like "Corrupt peers.dat file found. addrman database has been wiped and peers.dat has been renamed to peers.bak. Please report this to the Bitcoin Core developers." That way, we'd get bug reports if there was a bug in our serialization code, and there may be a way to salvage the peers.bak file if needed.

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK fab42d7

I think that the check should also be run (unconditionally) before serialize and if a problem is detected then don't write the corrupted stuff on disk.

Keeping the corrupted peers.dat on unserialize seems like a reasonable thing to do, but right now there are no tools to salvage it, so it would be useless. Maybe just stop the startup and ask the user to remove (or rename) the damaged peers.dat?

Notice - if it is possible to remotely corrupt peers.dat, then currently master is even worse than this PR - reading a corrupted peers.dat and keep operating with a "corrupted" addrman could lead to undefined behavior.

@naumenkogs
Copy link
Member

With this change, if we have a bug that causes us to write a peers.dat file that's corrupt in any way, then it would mostly be undetected - the addrman would be cleared and startup would continue as normal. As @mzumsande points out, if it's possible to trigger such a bug over p2p, then an adversary could cause victims to lose their entire addrman on restart.

It might be safer to do the ForceCheck, but then catch the exception, rename peers.dat to peers.bak, and terminate with some error message like "Corrupt peers.dat file found. addrman database has been wiped and peers.dat has been renamed to peers.bak. Please report this to the Bitcoin Core developers." That way, we'd get bug reports if there was a bug in our serialization code, and there may be a way to salvage the peers.bak file if needed.

I agree with John.

@DrahtBot
Copy link
Contributor

DrahtBot commented Aug 21, 2021

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #22974 (addrman: Improve performance of Good by mzumsande)
  • #22950 ([p2p] Pimpl AddrMan to abstract implementation details by amitiuttarwar)
  • #22910 ([RFC] Encapsulate asmap in NetGroupManager by jnewbery)
  • #22872 (log: improve checkaddrman logging with duration in milliseconds by jonatack)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@maflcko
Copy link
Member Author

maflcko commented Aug 21, 2021

I agree with @vasild that this is an unrelated bug that already exists. I've created a separate pull to fix that: #22762

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 4fd5305

CAddrMan::Unserialize() already, before this PR, could throw if it encounters a corrupted peers.dat. I think it is logical to do the same if the consistency checks fail after the unserialize process has finished "successfully". Also, it is important enough and not too expensive to do the consistency checks always, unconditionally, regardless of -checkaddrman at the end of unserialize.

(the fuzzer is upset, probably needs some petting)

@maflcko
Copy link
Member Author

maflcko commented Sep 15, 2021

(the fuzzer is upset, probably needs some petting)

Thanks. Force pushed to pet the fuzzer and fixup the docs. (no other code changes)

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK fa6f5c9

@maflcko
Copy link
Member Author

maflcko commented Sep 15, 2021

@mzumsande @naumenkogs @jnewbery Now that #22762 is merged, I rebased this pull. This should address your concerns.

@naumenkogs
Copy link
Member

ACK fa6f5c9

MarcoFalke added 4 commits September 21, 2021 10:07
This speeds up compilation of the whole program because the included
header file is smaller.

Can be reviewed with --color-moved=dimmed-zebra --color-moved-ws=ignore-all-space
Assert should only be used for program internal logic errors, not to
sanitize external user input.
Can be reviewed with --color-moved=dimmed-zebra
@maflcko
Copy link
Member Author

maflcko commented Sep 21, 2021

Rebased; Only trivial conflicts in the tests.

@naumenkogs
Copy link
Member

ACK fa3669f

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK fa3669f

@jnewbery
Copy link
Contributor

Code review ACK fa3669f

@maflcko maflcko merged commit a8a272a into bitcoin:master Sep 21, 2021
@maflcko maflcko deleted the 2108-addrmanNoCrash branch September 21, 2021 16:26
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Sep 21, 2021
…k after deserialize

fa3669f fuzz: Move all addrman fuzz targets to one file (MarcoFalke)
fa7a883 addrman: Replace assert with throw on corrupt data (MarcoFalke)
fa29897 Refactor: Turn the internal addrman check helper into a forced check (MarcoFalke)
fae5c63 move-only: Move CAddrMan::Check to cpp file (MarcoFalke)

Pull request description:

  Assert should only be used for program internal logic errors, not to sanitize external user input.

  The assert was introduced via the debug-only runtime option `-checkaddrman` in commit 803ef70, thus won't need a backport.

  Also, it doesn't really make sense to continue when the deserialized addrman doesn't pass the sanity check.

  For example, if `nLastSuccess` is negative, it would  later result in integer overflows. Thus, this patch fixes bitcoin#22931.

  Also,
  Fixes bitcoin#22503
  Fixes bitcoin#22504
  Fixes bitcoin#22519

  Closes bitcoin#22498

  Steps to test:

  ```
  mkdir -p /tmp/test_235/regtest/
  echo 'H4sIAAAAAAAAA/u1f+stZmUGYgELgwPRakfBKBgFo2AUjIJRMApGwSgYBaNgFIyCUTBswdyGpFnLjUKjP9e0bvjYusl6b+L2e7Vs2dd6N//Pua0/xQUALJAn93IQAAA=' | base64 --decode | zcat > /tmp/test_235/regtest/peers.dat
  ./src/qt/bitcoin-qt -regtest -datadir=/tmp/test_235/ -checkaddrman=1 -printtoconsole | grep -A2 'Loading P2P addresses'
  ```

  Output before:
  ```
  2021-09-10T11:28:37Z init message: Loading P2P addresses…
  2021-09-10T11:28:37Z ADDRMAN CONSISTENCY CHECK FAILED!!! err=-16
  bitcoin-qt: addrman.cpp:765: void CAddrMan::Check() const: Assertion `false' failed.

  (program crashes)
  ```

  Output after:
  ```
  2021-09-10T11:26:00Z init message: Loading P2P addresses…
  2021-09-10T11:26:00Z Error: Invalid or corrupt peers.dat (Corrupt data. Consistency check failed with code -16: iostream error). If you believe this is a bug, please report it to https://github.com/bitcoin/bitcoin/issues. As a workaround, you can move the file ("/tmp/test_235/regtest/peers.dat") out of the way (rename, move, or delete) to have a new one created on the next start.

  (program exits)
  ```

ACKs for top commit:
  naumenkogs:
    ACK fa3669f
  jnewbery:
    Code review ACK fa3669f
  vasild:
    ACK fa3669f

Tree-SHA512: 687e4a4765bbc66495152fa7a49d28ee84b405dc5370ba87b4016b5593e45f54c4ce5cae579e4d433e0e082d20fc263969fa602679c911accef0adb2d6213bd6
fanquake added a commit that referenced this pull request Sep 23, 2021
…erflow:addrman.cpp"

fa5e8c1 Revert "test: Add missing suppression signed-integer-overflow:addrman.cpp" (MarcoFalke)

Pull request description:

  Forgot to do this in #22734

ACKs for top commit:
  fanquake:
    ACK fa5e8c1

Tree-SHA512: cabd741b73c828ae3c40df37d80022039e41e571fa196eebcb41e51c582064a1463a3677eb3b0e997ed141b32003802019474d2b739f0b606b4a16da4f585faa
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Sep 24, 2021
…eger-overflow:addrman.cpp"

fa5e8c1 Revert "test: Add missing suppression signed-integer-overflow:addrman.cpp" (MarcoFalke)

Pull request description:

  Forgot to do this in bitcoin#22734

ACKs for top commit:
  fanquake:
    ACK fa5e8c1

Tree-SHA512: cabd741b73c828ae3c40df37d80022039e41e571fa196eebcb41e51c582064a1463a3677eb3b0e997ed141b32003802019474d2b739f0b606b4a16da4f585faa
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Oct 19, 2022
Summary:
```
Assert should only be used for program internal logic errors, not to sanitize external user input.
The assert was introduced via the debug-only runtime option -checkaddrman in commit 803ef70, thus won't need a backport.
Also, it doesn't really make sense to continue when the deserialized addrman doesn't pass the sanity check.
For example, if nLastSuccess is negative, it would later result in integer overflows.
```

Backport of [[bitcoin/bitcoin#22734 | core#22734]].

Depends on D12310.

Test Plan:
  ninja all check-all

Reviewers: #bitcoin_abc, PiRK

Reviewed By: #bitcoin_abc, PiRK

Differential Revision: https://reviews.bitcoinabc.org/D12311
@bitcoin bitcoin locked and limited conversation to collaborators Oct 30, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
7 participants