Skip to content

Conversation

jnewbery
Copy link
Contributor

We do not mark any blocks that fail CheckBlock() as BLOCK_FAILED_VALID
since they could have been mutated and marking a valid-but-mutated block
as invalid would prevent us from ever syncing to that chain. See
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2019-February/016697.html
for full details.

The current guard against marking CheckBlock() failed blocks as invalid
is by calling CheckBlock() prior to AcceptBlock() in ProcessNewBlock().
That is brittle since AcceptBlock() has an implicit assumption that any
block submitted has been checked for mutation. A future change to
ProcessNewBlock() could overlook that implicit assumption and introduce
a consensus failure.

Move the mutation guard logic into AcceptBlock() and
add comments to explain why we never mark CheckBlock() failed blocks as
invalid.

@DrahtBot
Copy link
Contributor

DrahtBot commented Nov 25, 2019

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Copy link

@ariard ariard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK.

Maybe hold for follow-up but did you look to remove CheckBlockHeader from CheckBlock given it's called now in AcceptBlockHeader which is call in AcceptBlock ? I think that's okay if we add a CheckBlockHeader in TestBlockValidity.

On the DoS-side we now have AcceptBlockHeader called before CheckBlock, I think that's better given block header checks are cheaper than block ones.

@@ -3769,18 +3781,9 @@ bool ProcessNewBlock(const CChainParams& chainparams, const std::shared_ptr<cons
if (fNewBlock) *fNewBlock = false;
BlockValidationState state;

// CheckBlock() does not support multi-threaded block validation because CBlock::fChecked can cause data race.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you keep CheckBlock as it is can we keep this to inform future refactors ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should either:

  • be a comment in CheckBlock saying that cs_main should usually be held when calling CheckBlock
  • add an AssertLockHeld(cs_main) to CheckBlock (and also add that to FillBlock and update the callers in bench to hold the lock)
  • remove fChecked so that CheckBlock doesn't need to hold cs_main

// of block mutation that cause CheckBlock() to fail; see e.g.
// CVE-2012-2459 and
// https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2019-February/016697.html.
// Because CheckBlock() is not very expensive, the anti-DoS benefits of
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You said here than not caching failure is okay because CheckBlock() isn't very expensive but at same time we use fChecked to return early to avoid reprocessing. It seems a bit an inconsistent position. If it's cheap enough we shouldn't bother with fChecked and lock tacking shouldn't cover CheckBlock? Or we could split CheckBlock between CheckBlockIntegrity and CheckBlockValidity and have a fDefinitelyInvalid to skip both if see block again?

Copy link
Contributor Author

@jnewbery jnewbery Mar 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fChecked is a slightly different caching mechanism. It's stored on the CBlock object and prevents having to do the merkle root and POW checking for the same object more than once. The CBlock object is new each time we redownload a block, so this caching doesn't prevent us from re-checking an invalid block downloaded more than once. On the other hand, the BlockManager.m_block_index is what prevents us from checking an invalid block downloaded a second time.

We may actually be able to remove fChecked after this PR. Before this PR, we call CheckBlock three times on the same block:

  1. from ProcessNewBlock()
  2. from AcceptBlock()
  3. from ConnectBlock()

(1) is removed by this PR and (3) is not on the critical path for compact block relay (since we relay the compact block as soon as we've done the merkle tree/pow checks the first time, and before we save to disk or connect the block).

LOCK(cs_main);

BlockValidationState state;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: keep struct construction outside of cs_main, that's still a concurrency

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's still a concurrency

What you mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @ariard means that there's no need to construct the BlockValidationState object within the cs_main scope. That's true, but constructing this object is very cheap, so I don't think it's a problem (and placing the variable declaration next to where it's used make this clearer).

I'm actually going to remove this commit from the PR, since I don't think it's necessary (and may make it more likely to introduce a bug, since the callers to ProcessNewBlock() in net_processing don't check the return code of the function before using the fNewBlock out param.

Copy link
Contributor

@promag promag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK.

LOCK(cs_main);

BlockValidationState state;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's still a concurrency

What you mean?

We do not mark any blocks that fail CheckBlock() as BLOCK_FAILED_VALID
since they could have been mutated and marking a valid-but-mutated block
as invalid would prevent us from ever syncing to that chain. See
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2019-February/016697.html
for full details.

The current guard against marking CheckBlock() failed blocks as invalid
is by calling CheckBlock() prior to AcceptBlock() in ProcessNewBlock().
That is brittle since AcceptBlock() has an implicit assumption that any
block submitted has been checked for mutation. A future change to
ProcessNewBlock() could overlook that implicit assumption and introduce
a consensus failure.

In this commit we move the mutation guard logic into AcceptBlock() and
add comments to explain why we never mark CheckBlock() failed blocks as
invalid.
@jnewbery jnewbery force-pushed the 2019-11-checkblock-in-acceptblock branch from c433cc4 to eb3b20e Compare March 13, 2020 17:03
@jnewbery
Copy link
Contributor Author

Rebased on latest master and removed the final commit.

@ariard

Maybe hold for follow-up but did you look to remove CheckBlockHeader from CheckBlock

As you note, if we did that, we'd need to add CheckBlockHeader() to TestBlockValidity() and also CVerifyDB::VerifyDB(). I don't think that counts as a simplification.

@jnewbery
Copy link
Contributor Author

jnewbery commented Jul 9, 2020

Closing this for now. I think it's the right change, but it's not high priority and there doesn't seem to be much interest.

@jnewbery jnewbery closed this Jul 9, 2020
Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this is a good change, as well as, in increasing order of preference, the items in your comment #17601.

ACK eb3b20e code review, built and tested after rebasing onto master.

@jonatack
Copy link
Member

jonatack commented Jul 9, 2020

GitHub doesn't seem to parse its own link #17601 (https://github.com/bitcoin/bitcoin/pull/17601/#discussion_r392311705) correctly, so I meant the items in the following comment, in increasing order of preference:

I think this should either:

* be a comment in `CheckBlock` saying that `cs_main` should usually be held when calling `CheckBlock`

* add an `AssertLockHeld(cs_main)` to `CheckBlock` (and also add that to `FillBlock` and update the callers in bench to hold the lock)

* remove `fChecked` so that `CheckBlock` doesn't need to hold `cs_main`

@jonatack
Copy link
Member

jonatack commented Jul 9, 2020

I think it's the right change, but it's not high priority and there doesn't seem to be much interest.

I empathise with this sentiment and have closed my own pull requests for the same reason, combined with the growing stack of open PRs needing attention and thinking I shouldn't have too many PRs open. That said, unless a squeaky wheel calls for grease, review attention seems to be flood-or-drought. Dormant, then suddenly, unexpectedly present. So I guess what matters most is if the PR author is still interested in their own PR.

@jnewbery
Copy link
Contributor Author

jnewbery commented Jul 9, 2020

This is consensus-critical and the changing the ordering of checks here could potentially introduce very subtle consensus failures, and so this PR requires very careful review. I have 9 other PRs open and more branches that I haven't PRed and I'd prefer to focus review attention on those.

@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants