-
Notifications
You must be signed in to change notification settings - Fork 37.7k
wallet: Avoid potentially writing incorrect best block locator #29652
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/29652. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
I couldn't really come up with a test for this bug directly. I did find if I added an assert in --- a/src/wallet/wallet.cpp
+++ b/src/wallet/wallet.cpp
@@ -634,6 +634,14 @@ void CWallet::chainStateFlushed(ChainstateRole role, const CBlockLocator& loc)
if (m_attaching_chain || role == ChainstateRole::BACKGROUND) {
return;
}
+ if (!loc.IsNull()) {
+ LOCK(cs_wallet);
+ int loc_height{-1};
+ if (m_last_block_processed_height >= 0 && chain().findBlock(loc.vHave.front(), FoundBlock().height(loc_height))) {
+ WalletLogPrintf("chainStateFlushed processed height %i locator height %i\n", m_last_block_processed_height, loc_height);
+ assert(m_last_block_processed_height >= loc_height);
+ }
+ }
WalletBatch batch(GetDatabase());
batch.WriteBestBlock(loc);
}
--- a/test/functional/feature_assumeutxo.py
+++ b/test/functional/feature_assumeutxo.py
@@ -52,6 +52,8 @@ COMPLETE_IDX = {'synced': True, 'best_block_height': FINAL_HEIGHT}
class AssumeutxoTest(BitcoinTestFramework):
+ def add_options(self, parser):
+ self.add_wallet_options(parser)
def set_test_params(self):
"""Use the pregenerated, deterministic chain up to height 199."""
@@ -215,6 +217,7 @@ class AssumeutxoTest(BitcoinTestFramework):
self.test_invalid_chainstate_scenarios()
self.log.info(f"Loading snapshot into second node from {dump_output['path']}")
+ n1.createwallet('w', load_on_startup=True)
loaded = n1.loadtxoutset(dump_output['path'])
assert_equal(loaded['coins_loaded'], SNAPSHOT_BASE_HEIGHT)
assert_equal(loaded['base_height'], SNAPSHOT_BASE_HEIGHT) However, this doesn't trigger in the assumeutxo case where the snapshot block This is all to say i couldn't really find a practical bug that could be triggered here or a good test to add to this PR. |
Updated e295db8 -> c5583a3 ( |
virtual CBlockLocator getTipLocator() = 0; | ||
|
||
//! Return a locator that refers to a block in the active chain. | ||
//! If specified block is not in the active chain, return locator for the latest ancestor that is in the chain. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed while debugging bitcoin#24230 that the ChainStateFlushed notification can be sent with a locator pointing to a different block than the block in the last BlockConnected notification. This is bad because it could cause the wallet to record that it is synced to a later block than it ever processed, and make it potentially fail to scan blocks for relevant transactions if it is unloaded and reloaded. This problem should probably not happen in practice, because normally the locator in the ChainStateFlushed notification does point to the block sent in the last BlockConnected notification. But because of the way ActivateBestChain accumulates block connected notifications, and the fact that FlushStateToDisk is called from many different code paths, this isn't guaranteed. (The new assumeutxo logic also introduces a situation where this is never the case: when the background chain reaches the snapshot height and validates the snapshot, the background chain is flushed before block connected notifications are sent.) In any case, it is better if the wallet writes a locator actually pointing to the last block that it processed instead of writing the locator validation code passes to ChainStateFlushed, so this commit switches to the right locator. A good followup to this change would probably be to drop the ChainStateFlushed locator argument entirely so it is not misused in the future. Indexing code already stopped using this locator argument a long time ago for identifying the best block (in 4368384 from bitcoin#14121), but it is still using the locator argument to log diagnostic warnings.
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the Possibly this is due to a silent merge conflict (the changes in this pull request being Leave a comment here, if you need help tracking down a confusing failure. |
WalletBatch batch(GetDatabase()); | ||
batch.WriteBestBlock(loc); | ||
CBlockLocator loc; | ||
WITH_LOCK(cs_wallet, chain().findBlock(m_last_block_processed, FoundBlock().locator(loc))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this shouldn't happen but.. what about logging an error if findBlock()
returns false?
Also, could lock cs_wallet
only to copy m_last_block_processed
.
@@ -3290,7 +3294,7 @@ bool CWallet::AttachChain(const std::shared_ptr<CWallet>& walletInstance, interf | |||
} | |||
} | |||
walletInstance->m_attaching_chain = false; | |||
walletInstance->chainStateFlushed(ChainstateRole::NORMAL, chain.getTipLocator()); | |||
walletInstance->chainStateFlushed(ChainstateRole::NORMAL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing note: you could probably trigger the behavior by adding a sleep call here. When a reorg is processed in-between the ScanForWalletTransactions
call and this chainStateFlushed
call.
Concept ACK I'm now wondering whether the wallet should even be doing anything on |
{ | ||
// Don't update the best block until the chain is attached so that in case of a shutdown, | ||
// the rescan will be restarted at next startup. | ||
if (m_attaching_chain || role == ChainstateRole::BACKGROUND) { | ||
return; | ||
} | ||
WalletBatch batch(GetDatabase()); | ||
batch.WriteBestBlock(loc); | ||
CBlockLocator loc; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be useful to check that chainStateFlushed
was called after blockConnected
like was done in
Lines 341 to 353 in 5560741
// This checks that ChainStateFlushed callbacks are received after BlockConnected. The check may fail | |
// immediately after the sync thread catches up and sets m_synced. Consider the case where | |
// there is a reorg and the blocks on the stale branch are in the ValidationInterface queue | |
// backlog even after the sync thread has caught up to the new chain tip. In this unlikely | |
// event, log a warning and let the queue clear. | |
const CBlockIndex* best_block_index = m_best_block_index.load(); | |
if (best_block_index->GetAncestor(locator_tip_index->nHeight) != locator_tip_index) { | |
LogPrintf("%s: WARNING: Locator contains block (hash=%s) not on known best " | |
"chain (tip=%s); not writing index locator\n", | |
__func__, locator_tip_hash.ToString(), | |
best_block_index->GetBlockHash().ToString()); | |
return; | |
} |
CBlockLocator
argument points to the same block as m_last_block_processed
will suffice.
I've implemented this idea in #30221 |
Notifications are sent in order, so it does have some meaning. It means wallet is in-between blockConnected events and synchronized up to some block, and that the node has written some data to disk recently, so the wallet may want write its data to disk, too. I think the only problem with the ChainStateFlushed notification is that because of the fact that |
🚧 At least one of the CI tasks failed. HintsMake sure to run all tests locally, according to the documentation. The failure may happen due to a number of reasons, for example:
Leave a comment here, if you need help tracking down a confusing failure. |
Marking as draft, because this has a silent merge conflict and I don't think I will be able to fix it right away. I still do think this PR is a good idea, but #30221 can be reviewed too if the alternate approach it implements is preferred (updating the wallet every block even it doesn't change). |
🐙 This pull request conflicts with the target branch and needs rebase. |
30a94b1 test, wallet: Remove concurrent writes test (Ava Chow) b44b7c0 wallet: Write best block record on unload (Ava Chow) 876a258 wallet: Remove unnecessary database Close step on shutdown (Ava Chow) 98a1a52 wallet: Remove chainStateFlushed (Ava Chow) 7fd3e1c wallet, bench: Write a bestblock record in WalletMigration (Ava Chow) 6d3a8b1 wallet: Replace chainStateFlushed in loading with SetLastBlockProcessed (Ava Chow) 7bacabb wallet: Update best block record after block dis/connect (Ava Chow) Pull request description: Implements the idea discussed in #29652 (comment) Currently, `m_last_block_processed` and `m_last_block_processed_height` are not guaranteed to match the block locator stored in the wallet, nor do either of those fields actually represent the last block that the wallet is synced up to. This is confusing and unintuitive. This PR changes those last block fields to be updated whenever the wallet makes a change to the db for new transaction state found in new blocks. Whenever a block is received that contains a transaction relevant to the wallet, the last block locator will now be written to disk. Furthermore, every block disconnection will now write an updated locator. To ensure that the locator is relatively recent and loading rescans are fairly quick in the event of unplanned shutdown, it is also now written every 144 blocks (~1 day). Additionally it is now written when the wallet is unloaded so that it is accurate when the wallet is loaded again. Lastly, the `chainstateFlushed` notification in the wallet is changed to be a no-op. The best block locator record is no longer written when `chainstateFlushed` is received from the node since it should already be mostly up to date. ACKs for top commit: rkrux: ACK 30a94b1 mzumsande: Code Review ACK 30a94b1 ryanofsky: Code review ACK 30a94b1. Only changes since last review are using WriteBestBlock method more places and updating comments. Tree-SHA512: 46117541f8aaf13dde57430e813b4bbbd5e146e2632769675803c8e65a82f149a7cc6026489a127d32684b90124bd2b7c28216dbcfa6a47447300e8f3814e029
⌛ There hasn't been much activity lately and the patch still needs rebase. What is the status here?
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since #30221 got merged, I think this one can be closed
re: #29652 (review)
Thanks! #30221 avoids the issue of wallets potentially writing the incorrect locators on flush notifications by just ignoring them and flushing in a less coordinated way. So the main concern of this PR is no longer an issue and it makes sense to close. I still think the changes here dropping the |
@ryanofsky, @furszy, done in #33169. |
…r and incaccurate getActiveChainLocator 2b00030 interfaces, chain, refactor: Remove inaccurate getActiveChainLocator (pablomartin4btc) 110a0f4 interfaces, chain, refactor: Remove unused getTipLocator (pablomartin4btc) Pull request description: Remove `Chain::getTipLocator`, `Chain::GetLocator()`, and `Chain::getActiveChainLocator`: - `Chain::getTipLocator` is no longer used. - `Chain::GetLocator`, replaced its call by `GetLocator()`, which uses `LocatorEntries`, avoiding direct access to the chain itself (change suggested by l0rinc while reviewing this PR to maintain consistency with the overall refactoring). - `Chain::getActiveChainLocator`, whose name was misleading, has functionality redundant with Chain::findBlock. - Additionally, the comment for getActiveChainLocator became inaccurate following changes in commit ed47094 (from PR #25717). This is a [follow-up](#29652 (comment)) to #29652. ACKs for top commit: achow101: ACK 2b00030 furszy: ACK 2b00030 stickies-v: ACK 2b00030 w0xlt: ACK 2b00030 Tree-SHA512: b12ba6a15feeaeec692d69204a6e155e3af43edfac25597dabf14cacca1e4a2152574816e58dc544f39043c5721f5e707acf544f4541d3b9c0f8c0c40069215e
This PR updates the
CWallet::chainStateFlushed
method to write them_last_block_processed
locator to the database as the wallet's best block instead of the chain tip locator.Saving the last block processed locator as the best block is less fragile than saving
chainStateFlushed
locator as best block.Right now wallet code relies on
blockConnected
notifications being sent beforechainStateFlushed
notifications, and on thechainStateFlushed
locator pointing to the last connected block. But this is a fragile assumption that can be easily broken byActivateBestChain
/ConnectTrace
changes, and in fact is currently broken (since d96c59c from #25740) when the assumeutxo snapshot block is validated. In that case the background validationchainStateFlushed
notification is sent before theblockConnected
notification so writing the locator as the best block would record a best block that was never processed. This specific change does not cause a problem for the wallet, because the wallet ignores events from the background validation chainstate. But nothing prevents similar out-of-orderchainStateFlushed
notifications from being sent in the future, so it is safer for the wallet to not rely on the chain tip sent inchainStateFlushed
notifications.A good followup to this change would probably be to drop the
ChainStateFlushed
locator argument entirely so it is not misused in the future. Indexing code already stopped using this locator argument a long time ago for identifying the best block (in 4368384 from #14121), though it is currently using the locator argument to log diagnostic warnings.