Validation: Move CheckBlock() mutation guard to AcceptBlock() #17601

jnewbery · 2019-11-25T20:37:30Z

We do not mark any blocks that fail CheckBlock() as BLOCK_FAILED_VALID
since they could have been mutated and marking a valid-but-mutated block
as invalid would prevent us from ever syncing to that chain. See
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2019-February/016697.html
for full details.

The current guard against marking CheckBlock() failed blocks as invalid
is to call CheckBlock() prior to AcceptBlock() in ProcessNewBlock().
That is brittle since AcceptBlock() has an implicit assumption that any
block submitted has been checked for mutation. A future change to
ProcessNewBlock() could overlook that implicit assumption and introduce
a consensus failure.

Move the mutation guard logic into AcceptBlock() and
add comments to explain why we never mark CheckBlock() failed blocks as
invalid.
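
To illustrate the idea, here is a minimal, self-contained sketch of a guard that lives inside the accept function rather than in its caller. It is a toy model, not Bitcoin Core code: `ToyBlock`, `ToyStatus`, `ToyBlockIndex`, `ToyCheckBlock` and `ToyAcceptBlock` are hypothetical names, and mutation is modelled by a single boolean rather than a real merkle root check.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model only: these types are hypothetical stand-ins, not Bitcoin Core's.
struct ToyBlock {
    std::string header_hash;  // identifies the block; unchanged by body mutation
    bool body_matches_header; // false models a mutated (or malformed) body
    bool contextually_valid;  // models checks that depend only on committed data
};

enum class ToyStatus { UNKNOWN, VALID, FAILED };

// Stand-in for a block index: permanent per-hash validation status.
using ToyBlockIndex = std::map<std::string, ToyStatus>;

// Stand-in for CheckBlock(): context-free checks that a mutated body can fail.
bool ToyCheckBlock(const ToyBlock& block) { return block.body_matches_header; }

// Stand-in for AcceptBlock() with the mutation guard moved inside it.
bool ToyAcceptBlock(const ToyBlock& block, ToyBlockIndex& index)
{
    assert(!block.header_hash.empty());
    ToyStatus& status = index[block.header_hash]; // new entries start as UNKNOWN
    if (status == ToyStatus::FAILED) return false; // cached permanent failure

    if (!ToyCheckBlock(block)) {
        // Mutation guard: reject this copy, but never cache the failure,
        // because an unmutated copy with the same hash may still be valid.
        return false;
    }
    if (!block.contextually_valid) {
        // In this toy model, failures that cannot be caused by mutation
        // are safe to cache against the hash.
        status = ToyStatus::FAILED;
        return false;
    }
    status = ToyStatus::VALID;
    return true;
}
```

In this sketch a mutated copy of a block is rejected without poisoning the index entry, so a later unmutated copy with the same hash is still accepted, and no caller has to remember to run the check first.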

DrahtBot · 2019-11-25T21:37:59Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

validation: Warm coins cache during prevalidation to connect blocks faster #19271 (by andrewtoth)
[doc] explain why CheckBlock() is called before AcceptBlock #15545 (by Sjors)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

ariard

Concept ACK.

Maybe hold this for a follow-up, but did you look at removing CheckBlockHeader from CheckBlock, given that it's now called in AcceptBlockHeader, which is called in AcceptBlock? I think that's okay if we add a CheckBlockHeader call in TestBlockValidity.

On the DoS side, we now have AcceptBlockHeader called before CheckBlock, which I think is better given that block header checks are cheaper than block checks.

ariard · 2019-12-05T18:26:23Z

src/validation.cpp

@@ -3769,18 +3781,9 @@ bool ProcessNewBlock(const CChainParams& chainparams, const std::shared_ptr<cons
        if (fNewBlock) *fNewBlock = false;
        BlockValidationState state;

-        // CheckBlock() does not support multi-threaded block validation because CBlock::fChecked can cause data race.


If you keep CheckBlock as it is, can we keep this comment to inform future refactors?

I think this should either:

* be a comment in CheckBlock saying that cs_main should usually be held when calling CheckBlock

* add an AssertLockHeld(cs_main) to CheckBlock (and also add that to FillBlock and update the callers in bench to hold the lock)

* remove fChecked so that CheckBlock doesn't need to hold cs_main
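
As a rough illustration of the second option, the sketch below shows a check function that asserts its lock is held before touching a mutable cache flag. Everything here is hypothetical: `TrackedMutex`, `FlaggedBlock` and `CheckWithLock` are toy stand-ins, and the ownership tracking is only similar in spirit to `AssertLockHeld(cs_main)`, not a copy of how Bitcoin Core implements it.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical mutex wrapper that remembers which thread currently owns it.
class TrackedMutex
{
    std::mutex m_mutex;
    std::atomic<std::thread::id> m_owner{std::thread::id{}};

public:
    void lock()
    {
        m_mutex.lock();
        m_owner.store(std::this_thread::get_id());
    }
    void unlock()
    {
        m_owner.store(std::thread::id{});
        m_mutex.unlock();
    }
    bool HeldByCurrentThread() const
    {
        return m_owner.load() == std::this_thread::get_id();
    }
};

// Toy block with a mutable "already checked" flag, in the spirit of fChecked.
struct FlaggedBlock {
    bool well_formed{true};
    mutable bool checked{false};
};

// Writing `checked` through a const reference is a data race if two threads
// do it at once, so this sketch insists that the caller holds the lock.
bool CheckWithLock(const FlaggedBlock& block, const TrackedMutex& main_lock)
{
    assert(main_lock.HeldByCurrentThread());
    if (block.checked) return true; // cached success
    if (!block.well_formed) return false;
    block.checked = true;
    return true;
}
```

A caller would take the lock first, for example with `std::lock_guard<TrackedMutex> guard(toy_cs_main);`, and then call `CheckWithLock(block, toy_cs_main)`. The third option avoids the question entirely: without the mutable flag there is nothing shared to protect.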

ariard · 2019-12-05T18:39:16Z

src/validation.cpp

+        // of block mutation that cause CheckBlock() to fail; see e.g.
+        // CVE-2012-2459 and
+        // https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2019-February/016697.html.
+        // Because CheckBlock() is not very expensive, the anti-DoS benefits of


You said here that not caching failure is okay because CheckBlock() isn't very expensive, but at the same time we use fChecked to return early and avoid reprocessing. That seems like a slightly inconsistent position. If it's cheap enough, we shouldn't bother with fChecked, and lock taking shouldn't cover CheckBlock? Or we could split CheckBlock into CheckBlockIntegrity and CheckBlockValidity and have an fDefinitelyInvalid flag to skip both if we see the block again?

fChecked is a slightly different caching mechanism. It's stored on the CBlock object and prevents having to do the merkle root and POW checking for the same object more than once. The CBlock object is new each time we redownload a block, so this caching doesn't prevent us from re-checking an invalid block downloaded more than once. On the other hand, the BlockManager.m_block_index is what prevents us from checking an invalid block downloaded a second time.

We may actually be able to remove fChecked after this PR. Before this PR, we call CheckBlock three times on the same block:

1. from ProcessNewBlock()

2. from AcceptBlock()

3. from ConnectBlock()

(1) is removed by this PR and (3) is not on the critical path for compact block relay (since we relay the compact block as soon as we've done the merkle tree/pow checks the first time, and before we save to disk or connect the block).
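
The difference between per-object caching and per-hash caching can be sketched as follows. This is a toy model under stated assumptions: `CachedBlock` and `CheckOnce` are hypothetical names, and the `checked` flag only imitates the role described for fChecked above.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Toy block whose success flag lives on the object itself.
struct CachedBlock {
    std::string hash;
    bool well_formed;
    mutable bool checked = false; // per-object cache, set only on success

    CachedBlock(std::string h, bool ok) : hash(std::move(h)), well_formed(ok) {}
};

// Counts how often the "expensive" part of the check actually runs.
bool CheckOnce(const CachedBlock& block, int& expensive_checks)
{
    assert(!block.hash.empty());
    if (block.checked) return true; // same object, already checked
    ++expensive_checks;             // stands in for merkle root and PoW checks
    if (!block.well_formed) return false;
    block.checked = true;
    return true;
}
```

Calling `CheckOnce` twice on the same object does the expensive work once, while a second `CachedBlock` constructed with the same hash (modelling a re-download) does it again. Avoiding repeated work across downloads therefore needs state keyed by hash, which is the role the block index plays in the explanation above.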

ariard · 2019-12-05T18:42:53Z

src/validation.cpp

        LOCK(cs_main);

+        BlockValidationState state;


nit: keep struct construction outside of cs_main, that's still a concurrency

that's still a concurrency

What you mean?

I think @ariard means that there's no need to construct the BlockValidationState object within the cs_main scope. That's true, but constructing this object is very cheap, so I don't think it's a problem (and placing the variable declaration next to where it's used makes this clearer).

I'm actually going to remove this commit from the PR, since I don't think it's necessary (and it may make it more likely to introduce a bug, since the callers to ProcessNewBlock() in net_processing don't check the return code of the function before using the fNewBlock out param).

promag

Concept ACK.

promag · 2019-12-06T08:08:05Z

src/validation.cpp

        LOCK(cs_main);

+        BlockValidationState state;


that's still a concurrency

What you mean?


jnewbery · 2020-03-13T17:04:13Z

Rebased on latest master and removed the final commit.

@ariard

Maybe hold for follow-up but did you look to remove CheckBlockHeader from CheckBlock

As you note, if we did that, we'd need to add CheckBlockHeader() to TestBlockValidity() and also CVerifyDB::VerifyDB(). I don't think that counts as a simplification.

jnewbery · 2020-07-09T05:54:07Z

Closing this for now. I think it's the right change, but it's not high priority and there doesn't seem to be much interest.

jonatack

I agree this is a good change, as well as, in increasing order of preference, the items in your comment #17601.

ACK eb3b20e code review, built and tested after rebasing onto master.

jonatack · 2020-07-09T09:08:02Z

GitHub doesn't seem to parse its own link #17601 (https://github.com/bitcoin/bitcoin/pull/17601/#discussion_r392311705) correctly, so I meant the items in the following comment, in increasing order of preference:

I think this should either:

* be a comment in `CheckBlock` saying that `cs_main` should usually be held when calling `CheckBlock`

* add an `AssertLockHeld(cs_main)` to `CheckBlock` (and also add that to `FillBlock` and update the callers in bench to hold the lock)

* remove `fChecked` so that `CheckBlock` doesn't need to hold `cs_main`

jonatack · 2020-07-09T09:25:07Z

I think it's the right change, but it's not high priority and there doesn't seem to be much interest.

I empathise with this sentiment and have closed my own pull requests for the same reason, combined with the growing stack of open PRs needing attention and thinking I shouldn't have too many PRs open. That said, unless a squeaky wheel calls for grease, review attention seems to be flood-or-drought. Dormant, then suddenly, unexpectedly present. So I guess what matters most is if the PR author is still interested in their own PR.

jnewbery · 2020-07-09T09:35:45Z

This is consensus-critical, and changing the ordering of checks here could potentially introduce very subtle consensus failures, so this PR requires very careful review. I have 9 other PRs open and more branches that I haven't PRed, and I'd prefer to focus review attention on those.

fanquake added the Validation label Nov 25, 2019


jnewbery added 2 commits March 13, 2020 13:01

[validation] Remove unused pindex outparam from AcceptBlock()

eb3b20e

jnewbery force-pushed the 2019-11-checkblock-in-acceptblock branch from c433cc4 to eb3b20e Compare March 13, 2020 17:03

DrahtBot mentioned this pull request Apr 14, 2020

assumeutxo #15606

Closed

18 tasks

DrahtBot mentioned this pull request Jun 14, 2020

validation: Warm coins cache during prevalidation to connect blocks faster #19271

Closed

jnewbery closed this Jul 9, 2020


bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
