assumeutxo: background validation completion #25740

jamesob · 2022-07-29T14:52:11Z

This is part of the assumeutxo project (parent PR: #15606)

Part two of replacing #24232.

When a user activates a snapshot, the serialized UTXO set data is used to create an "assumed-valid" chainstate, which becomes active in an attempt to get the node to network tip as quickly as possible. Simultaneously in the background, the already-existing chainstate continues "conventional" IBD to both accumulate full block data and serve as a belt-and-suspenders to validate the assumed-valid chainstate.

Once the background chainstate's tip reaches the base block of the snapshot used, we set m_stop_use on that chainstate and immediately take the hash of its UTXO set; we verify that this matches the assumeutxo value in the source code. Note that while we ultimately want to remove this background chainstate, we don't do so until the following initialization process, when we again check the UTXO set hash of the background chainstate, and if it continues to match, we remove the (now unnecessary) background chainstate, and move the (previously) assumed-valid chainstate into its place. We then reinitialize the chainstate in the normal way.

As noted in previous comments, we could do the filesystem operations "inline" immediately when the background validation completes, but that's basically just an optimization that saves disk space until the next restart. It didn't strike me as worth the risk of moving chainstate data around on disk during runtime of the node, though maybe my concerns are overblown.

The final result of this completion process is a fully-validated chain, where the only evidence that the user synced using assumeutxo is the existence of a base_blockhash file in the chainstate directory.

DrahtBot · 2022-07-29T21:47:10Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	achow101
Stale ACK	ryanofsky, ariard, brunoerg

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#26857 (OP_VAULT draft by jamesob)
#25977 (refactor: Replace std::optional<bilingual_str> with util::Result by ryanofsky)
#25722 (refactor: Use util::Result class for wallet loading by ryanofsky)
#25665 (refactor: Add util::Result failure values, multiple error and warning messages by ryanofsky)
#24008 (assumeutxo: net_processing changes by jamesob)
#15606 (assumeutxo by jamesob)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

jonatack · 2022-07-30T08:57:29Z

test/validation_chainstatemanager_tests.cpp:418:35: error: calling function 'DisconnectTip' requires holding mutex 'bg_chainstate.m_mempool->cs' exclusively [-Werror,-Wthread-safety-precise]
        BOOST_CHECK(bg_chainstate.DisconnectTip(unused_state, &unused_pool));
                                  ^
test/validation_chainstatemanager_tests.cpp:418:35: note: found near match 'mempool->cs'
1 error generated.

ryanofsky

Code review ACK d4fc4ae if missing chainstatemanager_loadblockindex test is added back. First 12 commits were previously reviewed as part of #25667. Later commits were cherry-picked from #24232 without code changes

ryanofsky · 2022-08-01T16:26:47Z

src/test/validation_chainstatemanager_tests.cpp

-//!   chainstate only contains fully validated blocks and the other chainstate contains all blocks,
-//!   even those assumed-valid.
-//!
-BOOST_FIXTURE_TEST_CASE(chainstatemanager_loadblockindex, TestChain100Setup)


In commit "test: add testcases for snapshot initialization" (5ff0d2f)

Something wonky is going on in this commit. It seems like some rebase error. This is dropping the chainstatemanager_loadblockindex test, and replacing it with an exact copy of the chainstatemanager_snapshot_init below, which now appears twice (and is edited in later commit 65ad086 to become a different chainstatemanager_snapshot_completion test)

@ryanofsky yeah, thanks - not sure what happened there. Should be fixed.

jamesob · 2022-09-06T22:47:36Z

Clang failure on some platforms looks spurious:

test/validation_chainstatemanager_tests.cpp:501:35: error: calling function 'DisconnectTip' requires holding mutex 'bg_chainstate.m_mempool->cs' exclusively [-Werror,-Wthread-safety-precise]
        BOOST_CHECK(bg_chainstate.DisconnectTip(unused_state, &unused_pool));
                                  ^
test/validation_chainstatemanager_tests.cpp:501:35: note: found near match 'mempool->cs'
1 error generated.

but lock is taken here.

Anyone have ideas for workarounds?

ryanofsky · 2022-09-06T22:52:28Z

If the lock is for bg_chainstate.GetMempool()->cs but the annotation requires bg_chainstate.m_mempool->cs, the compiler probably isn't going to know that GetMempool() and m_mempool are the same. Probably should stick with m_mempool

jamesob · 2022-09-06T23:02:05Z

Probably should stick with m_mempool

It's protected 😬. But looks like I can use the RETURNS_LOCK annotation on MempoolMutex().

Edit: that's private too, hah.

jonatack · 2023-02-22T19:14:10Z

In the pull description, could drop or update "Since it's dependent on the commits in #25667, I'm opening this as a draft."

ariard

ACK a00d0e7.

Since last ACK ab41e8b:

move nPruneTarget usage to BlockManager::GetPruneTarget() in LoadChainstate()
bug report call message (i.e PACKAGE_BUGREPORT) in LogPrintf in ActivateBestChain()
move GetUTXOStats mention to ComputeUTXOStats in MaybeCompleteSnapshotValidation comment
new BOOST_CHECKs in chainstatemanager_snapshot_completion about cache size bytes

Sjors · 2023-02-23T18:44:57Z

I'd like to test this with loadtxoutset. I wonder if I can just cherry-pick a few things from #15606 to do that myself, or if a more rigorous rebase is needed...

jamesob · 2023-02-28T16:58:53Z

Maintainers: what else would you like to see for a merge here?

achow101 · 2023-02-28T17:43:42Z

Maintainers: what else would you like to see for a merge here?

Waiting for a couple more reviews.

jamesob · 2023-03-02T19:17:50Z

In the pull description, could drop or update "Since it's dependent on the commits in #25667, I'm opening this as a draft."

Done, thanks.

brunoerg · 2023-03-03T18:29:06Z

The commit message of 287bb3f "validation: add CChainState::m_disabled and ChainMan::isUsable" still mentions m_stop_use rather than m_disabled

In the pull description, it still mention m_stop_use.

brunoerg

crACK a00d0e7

In 5ee22cd, it creates ChainstateManager::GetSnapshotBaseBlock and ChainstateManager::GetSnapshotBaseHeight. GetSnapshotBaseHeight is used (not in this commit) in MaybeCompleteSnapshotValidation to check if IBD (in background) has been completed.

In c29f26b, it creates CChainState::m_disabled and I could check it's been used to point that a chainstate should no longer be used by validation logic. (perhaps PR description should be updated).

Some logic I could understand:

In e91b216, it first added in CompleteChainstateInitialization

assert(chainman.m_total_coinstip_cache > 0);
assert(chainman.m_total_coinsdb_cache > 0);

Good to check we have enough bytes available for in-memory coins caches and leveldb stuff.

In LoadChainstate I could check (what is mentioned):

Loads the fully validated chainstate
Loads a chain created from a UTXO snapshot, if any exist.

With both ones, it calls MaybeCompleteSnapshotValidation to compare the background chainstate's UTXO hash with the hard-coded (and expected) one. If so, it sets m_disabled=true. And then, we can _re-_initialize the chainstate (InitializeChainstate).

If we could have loadtxoutset (as mentioned by @Sjors), it would be good to review it and see things happen in practice.

achow101 · 2023-03-07T00:45:32Z

Missed this on last review (sorry):

../../../src/validation.cpp: In member function ‘SnapshotCompletionResult ChainstateManager::MaybeCompleteSnapshotValidation(std::function<void(bilingual_str)>)’:
../../../src/validation.cpp:5450:14: error: catching polymorphic type ‘struct StopHashingException’ by value [-Werror=catch-value=]
 5450 |     } catch (StopHashingException) {
      |              ^~~~~~~~~~~~~~~~~~~~

Trigger completion when a background validation chainstate reaches the same height as a UTXO snapshot, and handle cleaning up the chainstate on subsequent startup.

I found this useful during unittest debugging.

for use in the following unittests.

Also adjusts the previous snapshot chainstate init tests to account for the fact that the init process is now attempting to validate and complete background chainstates whose tip is at the snapshot base block. We use a DisconnectTip() hack to preserve the nature of the test.

Include notes about the `chainstate_snapshot` rename as well as updates for the included code.

jamesob · 2023-03-07T21:08:12Z

Missed this on last review (sorry):

Fixed!

achow101 · 2023-03-07T23:36:42Z

ACK 2b373fe

fjahr · 2023-03-12T15:55:06Z

src/validation.cpp

+        return SnapshotCompletionResult::STATS_FAILED;
+    }
+
+    // XXX note that this function is slow and will hold cs_main for potentially minutes.


Just catching up... Not sure if intentional or not but maybe the XXX here was something you still wanted to address but missed @jamesob ?

maflcko

left some nits. Feel free to ignore

maflcko · 2023-03-13T14:59:25Z

src/validation.cpp

+                                   fs::path p_new,
+                                   const fs::filesystem_error& err) {
+        LogPrintf("%s: error renaming file (%s): %s\n",
+                __func__, fs::PathToString(p_old), err.what());


nit in d96c59c: No need to pass func

This is already done by log internally

This will print a useless operator(), twice: [validation.cpp:5712] [operator()] operator(): error renaming file

maflcko · 2023-03-13T15:13:58Z

src/validation.cpp

+        fs::rename(ibd_chainstate_path, tmp_old);
+    } catch (const fs::filesystem_error& e) {
+        rename_failed_abort(ibd_chainstate_path, tmp_old, e);
+        throw;


nit in d96c59c:

Looks like the LOC can be reduced by moving all of this into the lambda instead of repeating the passed paths twice and the throw logic twice.

maflcko · 2023-03-13T15:16:50Z

src/validation.cpp

+        auto dest_str = fs::PathToString(invalid_path);
+
+        LogPrintf("%s: error renaming file '%s' -> '%s': %s\n",
+                __func__, src_str, dest_str, e.what());


nit: Same about __func__

maflcko · 2023-03-13T15:19:50Z

src/validation.cpp

+            "Please report this incident to %s, including how you obtained the snapshot. "
+            "The invalid snapshot chainstate has been left on disk in case it is "
+            "helpful in diagnosing the issue that caused this error."),
+            PACKAGE_NAME, snapshot_tip_height, snapshot_base_height, snapshot_base_height, PACKAGE_BUGREPORT


Same commit: At the risk of people running into this, I wonder if this should use STR_INTERNAL_BUG() so that more info is printed about the version/commit hash they were using.

maflcko · 2023-03-13T15:20:25Z

src/validation.cpp

+    // chainstate past the snapshot base block.
+    if (WITH_LOCK(::cs_main, return m_disabled)) {
+        LogPrintf("m_disabled is set - this chainstate should not be in operation. " /* Continued */
+            "Please report this as a bug. %s\n", PACKAGE_BUGREPORT);


Same. (STR_INTERNAL_BUG(msg))

Summary: This is a backport of [[bitcoin/bitcoin#23154 | core#23154]] with updates from [[bitcoin/bitcoin#25667 | core#25667]] and [[bitcoin/bitcoin#25740 | core#25740]] Depends on D14655 Test Plan: proof-reading Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Subscribers: Fabien Differential Revision: https://reviews.bitcoinabc.org/D14656

Summary: and remove m_snapshot_validated. This state can now be inferred by the number of isUsable chainstates. m_disabled is used to signal that a chainstate should no longer be used by validation logic; it is used as a sentinel when background validation completes or if the snapshot chainstate is found to be invalid. isUsable is a convenience method that incorporates m_disabled. This is a partial backport of [[bitcoin/bitcoin#25740 | core#25740]] bitcoin/bitcoin@c29f26b Depends on D14673 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Subscribers: Fabien Differential Revision: https://reviews.bitcoinabc.org/D14685

Summary: Moves chainstate initialization into its own function. This is necessary to later support a more readable way of handling background-validation chainstate cleanup during init, since the chainstate initialization functions may need to be repeated after moving leveldb filesystem content around. This commit isn't strictly necessary, but the alternative is to (ab)use the `while` loop in init.cpp with a `continue` on the basis of a specific ChainstateLoadingError return value from LoadChainstate. Not only is this harder to read, but it can't be unittested. The approach here lets us consolidate background-validation cleanup to LoadChainstate, and therefore exercise it within tests. This commit is most easily reviewed with git diff --color-moved=dimmed-zebra --color-moved-ws=ignore-space-change This is a partial backport of [[bitcoin/bitcoin#25740 | core#25740]] bitcoin/bitcoin@f2a4f33 Depends on D14685 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Subscribers: Fabien Differential Revision: https://reviews.bitcoinabc.org/D14686

Summary: > add ChainstateManager.GetSnapshot{BaseHeight,BaseBlock}() > > For use in later commits. bitcoin/bitcoin@5ee22cd ---- > add Chainstate::HasCoinsViews() > > Used in subsequent commits. Also cleans up asserts in > coins_views-related convenience methods to be more exact. bitcoin/bitcoin@637a90b ---- > validation: add ChainMan logic for completing UTXO snapshot validation > > Trigger completion when a background validation chainstate reaches the > same height as a UTXO snapshot, and handle cleaning up the chainstate > on subsequent startup. bitcoin/bitcoin@d96c59c ---- > log: add LoadBlockIndex() message for assumedvalid blocks > > I found this useful during unittest debugging. bitcoin/bitcoin@7300ced ---- > refactor: make MempoolMutex() public > > for use in the following unittests. bitcoin/bitcoin@d70919a ---- > test: add snapshot completion unittests > > Also adjusts the previous snapshot chainstate init tests > to account for the fact that the init process is now attempting to > validate and complete background chainstates whose tip is at the > snapshot base block. We use a DisconnectTip() hack to preserve the > nature of the test. bitcoin/bitcoin@87a1108 ---- This completes backport of [[bitcoin/bitcoin#25740 | core#25740]] Depends on D14686 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Subscribers: Fabien Differential Revision: https://reviews.bitcoinabc.org/D14687

This was referenced Jul 30, 2022

rpc, test: Improve getblockstats for unspendables #19888

Merged

refactor: Nuke policy/fees->mempool circular dependencies #17786

Merged

ryanofsky reviewed Aug 1, 2022

View reviewed changes

DrahtBot added the Needs rebase label Aug 3, 2022

jamesob mentioned this pull request Sep 6, 2022

assumeutxo: snapshot initialization #25667

Merged

jamesob force-pushed the 2022-07-au.complete branch from d4fc4ae to ffa698e Compare September 6, 2022 20:17

DrahtBot removed the Needs rebase label Sep 6, 2022

jamesob force-pushed the 2022-07-au.complete branch from ffa698e to ad64bae Compare September 6, 2022 22:14

jamesob force-pushed the 2022-07-au.complete branch from ad64bae to 93e6a23 Compare September 6, 2022 23:06

DrahtBot added the Needs rebase label Sep 13, 2022

jamesob force-pushed the 2022-07-au.complete branch from 93e6a23 to 2e8d153 Compare September 13, 2022 17:53

DrahtBot removed the Needs rebase label Sep 13, 2022

DrahtBot removed the Needs rebase label Feb 22, 2023

ariard approved these changes Feb 22, 2023

View reviewed changes

DrahtBot mentioned this pull request Feb 23, 2023

OP_VAULT draft #26857

Closed

brunoerg approved these changes Mar 3, 2023

View reviewed changes

jamesob added 5 commits March 7, 2023 16:06

validation: add ChainMan logic for completing UTXO snapshot validation

d96c59c

Trigger completion when a background validation chainstate reaches the same height as a UTXO snapshot, and handle cleaning up the chainstate on subsequent startup.

log: add LoadBlockIndex() message for assumedvalid blocks

7300ced

I found this useful during unittest debugging.

refactor: make MempoolMutex() public

d70919a

for use in the following unittests.

docs: update assumeutxo.md

2b373fe

Include notes about the `chainstate_snapshot` rename as well as updates for the included code.

jamesob force-pushed the 2022-07-au.complete branch from a00d0e7 to 2b373fe Compare March 7, 2023 21:07

achow101 merged commit d5e4f9a into bitcoin:master Mar 7, 2023

sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Mar 8, 2023

Merge bitcoin#25740: assumeutxo: background validation completion

3a3d85f

fjahr reviewed Mar 12, 2023

View reviewed changes

maflcko reviewed Mar 13, 2023

View reviewed changes

justinmoon mentioned this pull request Mar 16, 2023

March Topics austinbitdevs/austinbitdevs.com#17

Closed

sdaftuar mentioned this pull request Jul 21, 2023

Rework validation logic for assumeutxo #27746

Merged

bitcoin locked and limited conversation to collaborators Mar 12, 2024

assumeutxo: background validation completion #25740

assumeutxo: background validation completion #25740

Uh oh!

Conversation

jamesob commented Jul 29, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DrahtBot commented Jul 29, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews

Conflicts

Uh oh!

jonatack commented Jul 30, 2022

Uh oh!

ryanofsky left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jamesob commented Sep 6, 2022

Uh oh!

ryanofsky commented Sep 6, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jamesob commented Sep 6, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jonatack commented Feb 22, 2023

Uh oh!

ariard left a comment

Choose a reason for hiding this comment

Uh oh!

Sjors commented Feb 23, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jamesob commented Feb 28, 2023

Uh oh!

achow101 commented Feb 28, 2023

Uh oh!

jamesob commented Mar 2, 2023

Uh oh!

brunoerg commented Mar 3, 2023

Uh oh!

brunoerg left a comment

Choose a reason for hiding this comment

Uh oh!

achow101 commented Mar 7, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jamesob commented Mar 7, 2023

Uh oh!

achow101 commented Mar 7, 2023

Uh oh!

Choose a reason for hiding this comment

Uh oh!

maflcko left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jamesob commented Jul 29, 2022 •

edited

Loading

DrahtBot commented Jul 29, 2022 •

edited

Loading

ryanofsky commented Sep 6, 2022 •

edited

Loading

jamesob commented Sep 6, 2022 •

edited

Loading

Sjors commented Feb 23, 2023 •

edited

Loading

achow101 commented Mar 7, 2023 •

edited

Loading