-
Notifications
You must be signed in to change notification settings - Fork 37.7k
validation: have LoadBlockIndex account for snapshot use #23174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
(this PR is missing in the Assume UTXO project) |
Concept ACK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After debug building rebased to master and an initial review, the code builds cleanly and makes sense, but I need to look more deeply and haven't yet checked if the change is covered by tests, so work-in-progress ACK 51e614f. A few minor comments, feel free to ignore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK 51e614f. I left a lot of suggestions and some questions, but this change seems like a step in right direction and it shouldn't change behavior in any noticeable way when there's no snapshot loaded.
Forgot to post this earlier. I wrote up a description of this PR when I started reviewing it that may be useful to others (or helpful to me if I got something wrong): Currently, This PR changes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK, as highlighted by other reviewers would be good to have unit tests to assert correctness of the changes.
51e614f
to
2d7ecfb
Compare
Hey, review works! Thanks to commentary from @ryanofsky @ariard, I've found that a lot of the original patch was unnecessary .I've pushed a much-simplified changeset that addresses the quandary I posted yesterday (#23174 (comment)). It turns out that the earlier version of this patch contained some unnecessary code that was a vestige from before I was persisting changes to the block index during snapshot activation (added in #21526). The new changeset just ensures that we don't add tip candidates which rely on assumed-valid blocks to the IBD chainstate. I think the changes are much more intuitive to follow, and a decent amount of code was dropped. For anyone who was in doubt: reviewing code pays off! |
// Pruned nodes may have deleted the block. | ||
if (pindex->nTx > 0) { | ||
if (pindex->pprev) { | ||
if (pindex->pprev->HaveTxsDownloaded()) { | ||
if (pindex->pprev->nChainTx > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this change is tautological, but done for clarity. HaveTxsDownloaded()
should probably be removed since its name conflicts with assumed-valid blocks that have had their nChainTx value set during snapshot activation.
2d7ecfb
to
dbca47e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK dbca47e. Much simpler now and looks all good to me!
In the PR description #23174 (comment):
When reconnecting the block index on startup, we have to account for the fact that assumed-valid blocks (per the snapshot chainstate) do not have nTx values. We can't exclude blocks from inclusion in
setBlockIndexCandidates
on this basis or we will trip over an empty set in an assertion later in startup.
I'm actually not sure what this is referring to, and it seems like it might not be applicable anymore according to #23174 (comment) if nTx values are set and the previous code is now reverted
We also have to ensure that blocks past the snapshot base block (i.e. the end of the assumed-valid region of the chain) are not included in setBlockIndexCandidates for the background validation chainstate.
The way this is phrased makes it sound like this change avoids adding blocks, when actually it is adding more blocks. Would maybe try to make it clearer this is now adding blocks for the background chainstate. Feel free to steal PR description I posted earlier too #23174 (comment) if that is useful
dbca47e
to
2833bd0
Compare
Man this changeset keeps getting smaller and smaller. Thanks @ryanofsky! I've reverted back to doing a stateful approach for assumed-valid detection in LoadBlockIndex since the more holistic approach of scanning ancestors ended up being prohibitively slow when loading the main chain. This has allowed me to drop a few more changes from the original patch. The only remaining item left is to write unittests for the contents of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK 2833bd0. Changes since last review: Making PopulateAndValidateSnapshot loop from 1 instead of calling isGenesis. Also adding unit test. Making LoadBlockIndex use height comparison to determine assumed-valid blocks instead of traversing pprev.
re: #23174 (comment), #23174 (comment)
I edited PR description to drop the part about nChainTx and stop referring to the snapshot height since implementation was changed to use first assumed-valid height, not last assumed-valid height. Previous description was:
This is part of the assumeutxo project (parent PR: #15606)
(PR description by ryanofsky)
Currently, BlockManager::LoadBlockIndex adds all blocks that have downloaded transactions to the active chain state's setBlockIndexCandidates set, ignoring the background chain state.
This PR changes ChainstateManager::LoadBlockIndex to:
Update
setBlockIndexCandidates
in the background chain, not just the active chain. In the active chain, all blocks are added as before. In the background chain, only blocks below the snapshot height are added so the background chain will continue to download and validate them. Sets pindex->nChainTx of the snapshot block to the value from snapshot metadata instead of leaving it 0.
// the valid chain. | ||
for (CChainState* chainstate : chainman.GetAll()) { | ||
if (!seen_assumed_valid || chainstate->reliesOnAssumedValid()) { | ||
chainstate->setBlockIndexCandidates.insert(pindex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "validation: have LoadBlockIndex account for snapshot use" (2833bd0)
I still think it would be really nice to call this function from a unit test and make sure this set is filled correctly. The logic here is fragile, and testing this part of LoadBlockIndex
could make it easier to test other parts as well.
EDIT: Sorry, I see you mentioned working on this in #23174 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for chiming in on what is now a very old PR, but I believe this block of code is incorrect. If we have a snapshot in place (and so some block index is marked IsAssumedValid), but we have downloaded blocks that haven't yet been activated (eg on the background-validation chainstate), this logic prevents us from adding those downloaded-but-not-yet verified blocks to setBlockIndexCandidate
for the background chainstate.
I ran into this while working on a download implementation for assumeutxo, and in some sequence of node restarts I got stuck in a state where my node had downloaded a bunch of blocks but never activated them, because they weren't in setBlockIndexCandidates
. I thought that restarting should fix the issue, but it didn't due to this block of code not doing the right thing...
Admittedly there is a lot of complicated interaction here but I think there's an invariant for setBlockIndexCandidates
that is violated by this block of code, which is that if we have the data for a block and all its predecessors and it has more work than our tip, then it should be in setBlockIndexCandidates
. By excluding blocks that are marked IsAssumedValid() we violate this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have a snapshot in place (and so some block index is marked IsAssumedValid), but we have downloaded blocks that haven't yet been activated (eg on the background-validation chainstate), this logic prevents us from adding those downloaded-but-not-yet verified blocks to setBlockIndexCandidate for the background chainstate
I'm having a hard time following how this could happen. If the bg chainstate has some downloaded-but-not-activated blocks, they are definitionally beneath the first_assumed_valid_height
(since we don't attach blocks above that height to the bg chainstate). So any of those blocks should be added by this code to the bg chainstate's setBlockIndexCandidates
. Have you seen an example where you're sure that a downloaded block on the bg chain is excluded by this code?
I don't doubt that you're seeing an issue though... I wonder if there might be a problem with how m_best_header
is set below? I.e. that state isn't qualified per chainstate. That's something I'll investigate now.
I'd also be curious if anyone else has seen this issue when testing the assumeutxo tracking branch (#27596). Is there a chance some code you've added is causing this?
By excluding blocks that are marked IsAssumedValid() we violate this.
Note that we're only excluding blocks marked assumedvalid (by way of the height check) for the bg chainstate, which still seems sensible to me. All blocks, assumedvalid or not, are added to the snapshot chainstate (or "regular" chain if no snapshot is in use).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they are definitionally beneath the first_assumed_valid_height (since we don't attach blocks above that height to the bg chainstate)
I don't believe this is true; after we download a block and before we validate the block, the CBlockIndex entry for that block will have the AssumedValid bit set to true. This means that if you were to download a bunch of blocks and then shutdown prior to those blocks being validated on the bg chainstate, that when you restart you'll be in this state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it's no problem for them to have the ASSUMED_VALID bit set to true; since they're beneath the snapshot base height, they'll be added to the setBlockIndexCandidates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code here isn't comparing with the snapshot base height; it's just checking if a block index entry is below the first_assumed_valid_height
, and only in that case does it get added to setBlockIndexCandidates
. [I don't believe there's any other logic elsewhere that will do what you are describing, of adding entries below the snapshot base height to setBlockIndexCandidates
?]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More generally: I think that we don't need to worry at all about whether the assumed-valid bit is set when adding entries to setBlockIndexCandidates
. We always validate a block when it gets connected, so the only invariant we need on entries in setBlockIndexCandidates
is whether they have enough work to be worth considering. (We used to have a hard requirement that we always HAVE_DATA for such entries, but after pruning we had to relax that a bit, as there are edge-case scenarios where you might HAVE_DATA for such an entry and then prune it before you try to connect it, so we handle that case by (re-)adding such entries to mapBlocksUnlinked
when they occur. But I digress...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code here isn't comparing with the snapshot base height; it's just checking if a block index entry is below the first_assumed_valid_height
Oops, total misread on my part. You're right; first_assumed_valid_height
will be the block after the tip of the background chain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic here does seem confusing.
Code comment says background candidate set needs to exclude BLOCK_ASSUMED_VALID blocks because otherwise ActivateBestChain "would add assumed-valid blocks to the chain". But that seems like exactly what it should do? (after validating them)
Commit message says background candidate set needs to exclude blocks after the snapshot height that don't have BLOCK_ASSUMED_VALID, but depend on blocks that do have BLOCK_ASSUMED_VALID. That idea does make sense, but does not seem to be what is implemented.
Taking another a step back, it seems like all of this would be simpler if background ActivateBestChain function just skipped calling FindMostWorkChain and tried to connect blocks up to the base of the snapshot.
5b6a20b
to
bf38154
Compare
@ryanofsky nice timing. I just pushed a unittest for |
dbea156
to
e35d167
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK e35d167. Main change since last review was just adding test. And some suggestions from the last review were also applied.
This PR is now a lot simpler than it was at previous points, so it should be pretty easy for previous reviewers to reack, or new reviewers to take a look at.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The newly patchset is far simpler, just left few minor comments to confirm my understanding of the changes.
src/validation.cpp
Outdated
// Fill each chainstate's block candidate set. Add all blocks as | ||
// candidates if the chainstate is allowed to rely on assumed-valid | ||
// blocks. Otherwise avoid adding assumed-valid blocks so the | ||
// chainstate will download and validate all earlier blocks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the comment can be slightly modified to mention the descendants of assumed-valid blocks, like in the commit description.
"Otherwise avoid adding assumed-valid blocks and their fully-validated descendants therefore inducing the chainstate to download and validate all blocks earlier than the first assumed-valid block".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I follow your suggested comment, since fully-validated blocks should be added to both chainstates, but I've updated the comment to be try and be clearer.
e35d167
to
bc3288b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK bc3288b. Only change since previous review was fixing up a code comment to be more clear about which chainstate it was referring to
7b08baf
to
7a576ba
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK 7a576ba. Only change since previous review is adding new asserts and some tweaks in the test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reviewed the code and it looks very good to me with one potential optimization I suggested below.
I also did some mutation testing on the changes and discovered that the test in the first commit doesn't seem to cover the change. Unless I made some error here myself, the test should probably be fixed before merging.
7a576ba
to
709a09c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK 709a09c. Only change since last review is test change moving genesis not assume-valid check after the successful CreateAndActivateUTXOSnapshot call instead of before, so it is actually meaningful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK 709a09c, did not look at the last commit (test) 🥇
Show signature
Signature:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
Concept ACK 709a09c9b6a902d5ed0e5ad6ac421b17c8921e6e, did not look at the last commit (test) 🥇
-----BEGIN PGP SIGNATURE-----
iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
pUggmQv+IHs5tQZR4a2VV+VvpfYa4GEdcIbiiJjRJqReLcICNClFReu5hpKfkKm4
8VbU64iwXezNAeAzlJYUB7vWMKdGlHMMYX4V9fpJ6zwGkl38eL8Xq7kzHLUDH6iO
FfC32v0nywoz2FcnT5sFtAodnTl8kR01lQOCZSUvsvHnrUiYh6TGZt0Xlt1lz8Sh
2/Orx7mEYlmpR9lDPLrU6DJscUm12r5hE/PQnLC0iJ+nmc/kpKGV6S25fR08E0Id
JfPUnpzetTdiJGgjx3Zi9fau0ROmadXYPRJw8LSLm5cWCTxAt32oo4OQnoQeILy2
pmp76uQGwEnnD4vytiPg4J2ZBQUIzYz+X3f1DhWJhbDplfAcR1pKHO/f1itCqUhu
CUw5IbT1Wu4ocMepisosddMuoIuYSbb3Z3QXizXZHnD6E7WWKa1oNherTzX3IaFM
RW+rKuEZ0n1Wa0Sv5NV8OxxPQYLDwPXopHHz1K9Mz4/DJV6eOE9Q5+l1RtkpH9mr
0nul+aMW
=nvxK
-----END PGP SIGNATURE-----
src/validation.cpp
Outdated
// assert that the assumed valid blocks will be validated by some not assumed-valid | ||
// chainstate. | ||
assert(any_of(chainstates.cbegin(), chainstates.cend(), | ||
[](auto chainstate) { return !chainstate->reliesOnAssumedValid(); })); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in commit 36e2bc4:
Can you explain the asserts a bit better? Currently it looks like they are asserting that two chainstates exist. This condition should be independent of the loop and could be moved outside?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This condition should be independent of the loop and could be moved outside?
I don't see why you say that because (i) this condition should only be asserted if we encounter an assumed-valid block, which we need to loop through the index to find out, and (ii) the condition is only asserted once since we break immediately afterwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've refactored this code and updated the commentary; hopefully you find it more intuitive now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With "condition" I mean "two chainstates exist".
Thus a loop of the form:
for b in blocks:
if b.assumed:
assert len(chains())==2
Can be rewritten into:
have_two_chains = len(chains())==2
for b in blocks:
if b.assumed:
assert have_two_chains
Obviously the performance should be the same here, but evaluating the condition outside the loop might be clearer. It would even allow to add additional checks. (Like asserting that no block is assumed if the number of chains is one?)
// beginning before the first assumed-valid block) might not get | ||
// added to the the background chainstate, but this is ok, because | ||
// they will still be attached to the active chainstate if they | ||
// actually contain more work. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in commit 36e2bc4:
I don't understand why it is safe to not attach them. If the assumed valid block is sufficiently old, it may cost little POW to create an alternative fork up to that height, and confusing the background chainstate that it can't connect to the active chainstate?
What is the reason for the if-condition in the first place? What would happen if both candidate sets are fully populated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If both candidate sets are fully populated, when ActivateBestChain is called on the background validation chainstate, FindMostWorkChain()
will return the assumed-valid block index entry as installed by snapshot activation, and it won't think it has to enter IBD to download and fully validate the chain up to the snapshot base block. I'll include this in the commentary above since people (myself included) seem to constantly forget why we need this conditional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the assumed valid block is sufficiently old, it may cost little POW to create an alternative fork up to that height, and confusing the background chainstate that it can't connect to the active chainstate?
I'm confused by this because the background chainstate shouldn't be considering forks anyway; its purpose is to verify the assumed-valid blocks beneath the snapshot and it does not handle reorgs or forks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I guess I am missing context on how the download logic will be modified for assumevalid.
If the assumevalid chain is assumed to be always linear, then maybe a check that all assumed blocks are parents of the assumed tip block makes sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK for the first commit 9be75f2
One suggestion to make the second easier to review...
Avoid modifying the genesis block index entry during snapshot load. This is because, in a future change that fixes LoadBlockIndex for UTXO snapshots, we detect block index entries that are reliant on assumed-valid ancestors and treat them specially. Since the genesis block doesn't have BLOCK_VALID_SCRIPTS, it would be erroneously marked BLOCK_ASSUMED_VALID during snapshot load if we didn't skip it here. This would cause a "setBlockIndexCandidates() empty" assertion to be tripped since all block index entries would be marked assume-valid due to genesis, which is never re-validated. There's probably no good reason to modify the genesis block index entry during snapshot load anyway...
Ensure that blocks past the snapshot base block (i.e. the end of the assumed-valid region of the chain) are not included in setBlockIndexCandidates for the background validation chainstate. These blocks, while fully validated and lacking the BLOCK_ASSUMED_VALID flag, *rely* on blocks which are assumed-valid, and so shouldn't be added to the IBD chainstate. Co-authored-by: Russ Yanofsky <russ@yanofsky.org>
Incorporates feedback from Russ Yanofsky.
709a09c
to
2283b9c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK 2283b9c
@@ -4918,7 +4925,7 @@ bool ChainstateManager::PopulateAndValidateSnapshot( | |||
index->nTx = 1; | |||
} | |||
// Fake nChainTx so that GuessVerificationProgress reports accurately | |||
index->nChainTx = index->pprev ? index->pprev->nChainTx + index->nTx : 1; | |||
index->nChainTx = index->pprev->nChainTx + index->nTx; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
d0c6e61: I like @MarcoFalke's suggestion of using Assert(index->pprev)
instead.
auto index = cs1.m_chain[i]; | ||
|
||
if (i < last_assumed_valid_idx && i >= assumed_valid_start_idx) { | ||
index->nStatus = BlockStatus::BLOCK_VALID_TREE | BlockStatus::BLOCK_ASSUMED_VALID; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add:
BOOST_CHECK(index->IsAssumedValid());
} else {
BOOST_CHECK(index->!IsAssumedValid());
Concept ACK 2283b9c 🤽 Show signatureSignature:
|
It looks like this PR has had a good review cycle and there are no critical issues left. I think we should move toward a final round of ACKs and merge it. Smaller suggestions and nits that come up now can be addressed later. |
Ok, I'll open a follow-up next week or so, unless someone beats me to it. |
Summary: Avoid modifying the genesis block index entry during snapshot load. This is because, in a future change that fixes LoadBlockIndex for UTXO snapshots, we detect block index entries that are reliant on assumed-valid ancestors and treat them specially. Since the genesis block doesn't have BLOCK_VALID_SCRIPTS, it would be erroneously marked BLOCK_ASSUMED_VALID during snapshot load if we didn't skip it here. This would cause a "setBlockIndexCandidates() empty" assertion to be tripped since all block index entries would be marked assume-valid due to genesis, which is never re-validated. There's probably no good reason to modify the genesis block index entry during snapshot load anyway... This is a partial backport of [[bitcoin/bitcoin#23174 | core#23174]] bitcoin/bitcoin@d0c6e61 Depends on D12301 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Differential Revision: https://reviews.bitcoinabc.org/D12303
This is part of the assumeutxo project (parent PR: #15606)
Currently,
BlockManager::LoadBlockIndex
adds all blocks that have downloaded transactions to the active chain state'ssetBlockIndexCandidates
set, ignoring the background chain state.This PR changes ChainstateManager::LoadBlockIndex to update
setBlockIndexCandidates
in the background chain, not just the active chain. In the active chain, the same blocks are added as before. In the background chain, only blocks that have actually been validated, not blocks marked assumed-valid are added so the background chain will continue to download and validate assumed-valid blocks.