-
Notifications
You must be signed in to change notification settings - Fork 37.7k
assumeutxo: background validation completion #25740
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
//! chainstate only contains fully validated blocks and the other chainstate contains all blocks, | ||
//! even those assumed-valid. | ||
//! | ||
BOOST_FIXTURE_TEST_CASE(chainstatemanager_loadblockindex, TestChain100Setup) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "test: add testcases for snapshot initialization" (5ff0d2f)
Something wonky is going on in this commit. It seems like some rebase error. This is dropping the chainstatemanager_loadblockindex
test, and replacing it with an exact copy of the chainstatemanager_snapshot_init
below, which now appears twice (and is edited in later commit 65ad086 to become a different chainstatemanager_snapshot_completion test)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ryanofsky yeah, thanks - not sure what happened there. Should be fixed.
d4fc4ae
to
ffa698e
Compare
ffa698e
to
ad64bae
Compare
Clang failure on some platforms looks spurious:
but lock is taken here. Anyone have ideas for workarounds? |
If the lock is for |
It's protected 😬. But looks like I can use the Edit: that's private too, hah. |
ad64bae
to
93e6a23
Compare
93e6a23
to
2e8d153
Compare
In the pull description, could drop or update "Since it's dependent on the commits in #25667, I'm opening this as a draft." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK a00d0e7.
Since last ACK ab41e8b:
- move
nPruneTarget
usage toBlockManager::GetPruneTarget()
inLoadChainstate()
- bug report call message (i.e
PACKAGE_BUGREPORT
) inLogPrintf
inActivateBestChain()
- move
GetUTXOStats
mention toComputeUTXOStats
inMaybeCompleteSnapshotValidation
comment - new
BOOST_CHECK
s inchainstatemanager_snapshot_completion
about cache size bytes
I'd like to test this with |
Maintainers: what else would you like to see for a merge here? |
Waiting for a couple more reviews. |
Done, thanks. |
In the pull description, it still mention |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
crACK a00d0e7
In 5ee22cd, it creates ChainstateManager::GetSnapshotBaseBlock
and ChainstateManager::GetSnapshotBaseHeight
. GetSnapshotBaseHeight
is used (not in this commit) in MaybeCompleteSnapshotValidation
to check if IBD (in background) has been completed.
In c29f26b, it creates CChainState::m_disabled
and I could check it's been used to point that a chainstate should no longer be used by validation logic. (perhaps PR description should be updated).
Some logic I could understand:
In e91b216, it first added in CompleteChainstateInitialization
assert(chainman.m_total_coinstip_cache > 0);
assert(chainman.m_total_coinsdb_cache > 0);
Good to check we have enough bytes available for in-memory coins caches and leveldb stuff.
In LoadChainstate
I could check (what is mentioned):
- Loads the fully validated chainstate
- Loads a chain created from a UTXO snapshot, if any exist.
With both ones, it calls MaybeCompleteSnapshotValidation
to compare the background chainstate's UTXO hash with the hard-coded (and expected) one. If so, it sets m_disabled=true
. And then, we can _re-_initialize the chainstate (InitializeChainstate
).
If we could have loadtxoutset
(as mentioned by @Sjors), it would be good to review it and see things happen in practice.
Missed this on last review (sorry):
|
Trigger completion when a background validation chainstate reaches the same height as a UTXO snapshot, and handle cleaning up the chainstate on subsequent startup.
I found this useful during unittest debugging.
for use in the following unittests.
Also adjusts the previous snapshot chainstate init tests to account for the fact that the init process is now attempting to validate and complete background chainstates whose tip is at the snapshot base block. We use a DisconnectTip() hack to preserve the nature of the test.
Include notes about the `chainstate_snapshot` rename as well as updates for the included code.
a00d0e7
to
2b373fe
Compare
Fixed! |
ACK 2b373fe |
return SnapshotCompletionResult::STATS_FAILED; | ||
} | ||
|
||
// XXX note that this function is slow and will hold cs_main for potentially minutes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just catching up... Not sure if intentional or not but maybe the XXX here was something you still wanted to address but missed @jamesob ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left some nits. Feel free to ignore
fs::path p_new, | ||
const fs::filesystem_error& err) { | ||
LogPrintf("%s: error renaming file (%s): %s\n", | ||
__func__, fs::PathToString(p_old), err.what()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit in d96c59c: No need to pass func
- This is already done by log internally
- This will print a useless
operator()
, twice:[validation.cpp:5712] [operator()] operator(): error renaming file
fs::rename(ibd_chainstate_path, tmp_old); | ||
} catch (const fs::filesystem_error& e) { | ||
rename_failed_abort(ibd_chainstate_path, tmp_old, e); | ||
throw; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit in d96c59c:
Looks like the LOC can be reduced by moving all of this into the lambda instead of repeating the passed paths twice and the throw logic twice.
auto dest_str = fs::PathToString(invalid_path); | ||
|
||
LogPrintf("%s: error renaming file '%s' -> '%s': %s\n", | ||
__func__, src_str, dest_str, e.what()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Same about __func__
"Please report this incident to %s, including how you obtained the snapshot. " | ||
"The invalid snapshot chainstate has been left on disk in case it is " | ||
"helpful in diagnosing the issue that caused this error."), | ||
PACKAGE_NAME, snapshot_tip_height, snapshot_base_height, snapshot_base_height, PACKAGE_BUGREPORT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same commit: At the risk of people running into this, I wonder if this should use STR_INTERNAL_BUG()
so that more info is printed about the version/commit hash they were using.
// chainstate past the snapshot base block. | ||
if (WITH_LOCK(::cs_main, return m_disabled)) { | ||
LogPrintf("m_disabled is set - this chainstate should not be in operation. " /* Continued */ | ||
"Please report this as a bug. %s\n", PACKAGE_BUGREPORT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same. (STR_INTERNAL_BUG(msg)
)
Summary: This is a backport of [[bitcoin/bitcoin#23154 | core#23154]] with updates from [[bitcoin/bitcoin#25667 | core#25667]] and [[bitcoin/bitcoin#25740 | core#25740]] Depends on D14655 Test Plan: proof-reading Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Subscribers: Fabien Differential Revision: https://reviews.bitcoinabc.org/D14656
Summary: and remove m_snapshot_validated. This state can now be inferred by the number of isUsable chainstates. m_disabled is used to signal that a chainstate should no longer be used by validation logic; it is used as a sentinel when background validation completes or if the snapshot chainstate is found to be invalid. isUsable is a convenience method that incorporates m_disabled. This is a partial backport of [[bitcoin/bitcoin#25740 | core#25740]] bitcoin/bitcoin@c29f26b Depends on D14673 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Subscribers: Fabien Differential Revision: https://reviews.bitcoinabc.org/D14685
Summary: Moves chainstate initialization into its own function. This is necessary to later support a more readable way of handling background-validation chainstate cleanup during init, since the chainstate initialization functions may need to be repeated after moving leveldb filesystem content around. This commit isn't strictly necessary, but the alternative is to (ab)use the `while` loop in init.cpp with a `continue` on the basis of a specific ChainstateLoadingError return value from LoadChainstate. Not only is this harder to read, but it can't be unittested. The approach here lets us consolidate background-validation cleanup to LoadChainstate, and therefore exercise it within tests. This commit is most easily reviewed with git diff --color-moved=dimmed-zebra --color-moved-ws=ignore-space-change This is a partial backport of [[bitcoin/bitcoin#25740 | core#25740]] bitcoin/bitcoin@f2a4f33 Depends on D14685 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Subscribers: Fabien Differential Revision: https://reviews.bitcoinabc.org/D14686
Summary: > add ChainstateManager.GetSnapshot{BaseHeight,BaseBlock}() > > For use in later commits. bitcoin/bitcoin@5ee22cd ---- > add Chainstate::HasCoinsViews() > > Used in subsequent commits. Also cleans up asserts in > coins_views-related convenience methods to be more exact. bitcoin/bitcoin@637a90b ---- > validation: add ChainMan logic for completing UTXO snapshot validation > > Trigger completion when a background validation chainstate reaches the > same height as a UTXO snapshot, and handle cleaning up the chainstate > on subsequent startup. bitcoin/bitcoin@d96c59c ---- > log: add LoadBlockIndex() message for assumedvalid blocks > > I found this useful during unittest debugging. bitcoin/bitcoin@7300ced ---- > refactor: make MempoolMutex() public > > for use in the following unittests. bitcoin/bitcoin@d70919a ---- > test: add snapshot completion unittests > > Also adjusts the previous snapshot chainstate init tests > to account for the fact that the init process is now attempting to > validate and complete background chainstates whose tip is at the > snapshot base block. We use a DisconnectTip() hack to preserve the > nature of the test. bitcoin/bitcoin@87a1108 ---- This completes backport of [[bitcoin/bitcoin#25740 | core#25740]] Depends on D14686 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Subscribers: Fabien Differential Revision: https://reviews.bitcoinabc.org/D14687
This is part of the assumeutxo project (parent PR: #15606)
Part two of replacing #24232.
When a user activates a snapshot, the serialized UTXO set data is used to create an "assumed-valid" chainstate, which becomes active in an attempt to get the node to network tip as quickly as possible. Simultaneously in the background, the already-existing chainstate continues "conventional" IBD to both accumulate full block data and serve as a belt-and-suspenders to validate the assumed-valid chainstate.
Once the background chainstate's tip reaches the base block of the snapshot used, we set
m_stop_use
on that chainstate and immediately take the hash of its UTXO set; we verify that this matches the assumeutxo value in the source code. Note that while we ultimately want to remove this background chainstate, we don't do so until the following initialization process, when we again check the UTXO set hash of the background chainstate, and if it continues to match, we remove the (now unnecessary) background chainstate, and move the (previously) assumed-valid chainstate into its place. We then reinitialize the chainstate in the normal way.As noted in previous comments, we could do the filesystem operations "inline" immediately when the background validation completes, but that's basically just an optimization that saves disk space until the next restart. It didn't strike me as worth the risk of moving chainstate data around on disk during runtime of the node, though maybe my concerns are overblown.
The final result of this completion process is a fully-validated chain, where the only evidence that the user synced using assumeutxo is the existence of a
base_blockhash
file in thechainstate
directory.