-
Notifications
You must be signed in to change notification settings - Fork 37.7k
assumeutxo: indexing changes #24006
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
assumeutxo: indexing changes #24006
Conversation
For use in later commits.
To be used in subsequent commits.
Adds BackgroundBlockConnected to the validationinterface and uses it in index maintenance. Ensures that index building will work when part of the chain is validated asynchronously in the background by a second chainstate. This changeset removes the guarantee that indexes will be built sequentially, as none of the current indexes require this and it is no longer easy to offer this guarantee when multiple chainstates are in use. Within BaseIndex, we only update `m_best_block_index` (which is essentially the progress marker for how far along the indexing process is) for chains which we can be certain that all blocks under a given block on that chain have had indexing performed on them. In other words, we will not update `m_best_block_index` during a BlockConnected event that comes from an active snapshot chain, i.e. a chain for which some of the blocks underneath Tip() are not indexed. Once background validation is completed and no background chainstates are in use, the indexer will happily use BlockConnected events from the active chain to update `m_best_block_index` as usual. No blocks from the active chainstate will have been missed for indexing (despite not previously updating m_best_block_index), so it is safe to perform this transition without any extra behavior. Some unnecessary logic from BaseIndex::ChainStateFlushed() is removed, since the locator can no longer be assumed to be an ancestor of the best_block_index (since the "best block" may be on a background chainstate and the locator may have come from an assumed-valid tip well ahead of it).
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changeset removes the guarantee that indexes will be built
sequentially, as none of the current indexes require this and it is no
longer easy to offer this guarantee when multiple chainstates are in
use.
I'm confused. IIUC when a new index starts building while IBD is in progress it will:
- While only synching from assume valid to the tip: do nothing
- While syncing the background chain state: index, from genesis up
- When background chain state is finish: index assumed valid blocks
If so, then it's still building sequentially. Or is it indexing assumed valid blocks already in (1) and (2)?
Also, doesn't MuHash in coinstatsindex require the index to be in sequence, because the hash for each block is an increment of that of the previous block? Or do we store the increment and infer the value for each block on the fly? cc @fjahr
Regarding ValidationInterface callbacks for active chain, do you have any hint for how to reason about what the consumers of these callbacks expect?
@@ -822,6 +822,10 @@ class ChainstateManager | |||
CBlockIndex** ppindex) EXCLUSIVE_LOCKS_REQUIRED(cs_main); | |||
friend CChainState; | |||
|
|||
// Returns nullptr if no snapshot ahs been loaded. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: ahs
-> has
//! In other words, give us the chainstate for which we can reasonably expect | ||
//! that all blocks beneath the tip have been indexed. In practice this means | ||
//! when using an assumed-valid chainstate based upon a snapshot, return only the | ||
//! fully validated chain. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit cryptically worded, because "fully validated chain" could be read as "the background chain, but only once it's fully validated".
It works like this in the index but since the MuHash is just a representation of a UTXO set and does not commit to a previous state you can "jump in" at any point in the chain and calculate the MuHash at any height by looking only at the UTXO set (inefficient because the full set has to be processed) and then from this point on work with the deltas (=blocks) as usual (efficient). Because of this it is also possible to calculate the MuHash at the tip without the index, it just takes a while. I haven't looked at the code in detail yet but just skimming I think the code for this starting of the coinstatsindex at any height is not included yet, right? But since assumeutxo supplies the UTXO set and its corresponding height this should be possible to build this in a follow-up. But I agree with Sjors' question on the conceptual level: do we really need the indices to sync already right away on the blocks between the au height and tip or is it not maybe simpler to have the indices only sync with the background chain? Syncing the blocks between au and tip first we could serve blockfilters at the tip right away, which is nice, but if there is ever an issue with an incorrect snapshot then we make it other people's problem too because we are serving incorrect blockfilters. |
The way this works as-written is that if the user starts with an empty datadir and then proceeds to load a snapshot, indexes will be built on the basis of incoming (Background)BlockConnected events that will come from both chainstates (due to the behavior documented in the unmerged #22485) - hence the lack of ordering. But now that we're discussing it, maybe it's preferable to just defer index building until we're done with the assumed-valid chain...
Good catch! Thought I'd checked that but apparently not: https://github.com/jamesob/bitcoin/blob/e84248814296b82d926f5018dd62d917ddefe626/src/index/coinstatsindex.cpp#L118-L121 A few options that come to mind:
|
I think we can introduce |
Closing this while I work on a simpler implementation that preserves sequential indexing at the cost of deferring indexing on the assumed-valid chainstate. |
This is part of the assumeutxo project (parent PR: #15606)
This PR includes the necessary changes for indexing to work with the operation of a background validation chainstate. In short, it
More details below.
Adds BackgroundBlockConnected to the validationinterface and uses it in
index maintenance. Ensures that index building will work when part of
the chain is validated asynchronously in the background by a second
chainstate.
This changeset removes the guarantee that indexes will be built
sequentially, as none of the current indexes require this and it is no
longer easy to offer this guarantee when multiple chainstates are in
use.
Within BaseIndex, we only update
m_best_block_index
(which isessentially the progress marker for how far along the indexing process
is) for chains which we can be certain that all blocks under a given
block on that chain have had indexing performed on them. In other
words, we will not update
m_best_block_index
during a BlockConnectedevent that comes from an active snapshot chain, i.e. a chain for which
some of the blocks underneath Tip() are not indexed.
Once background validation is completed and no background chainstates
are in use, the indexer will happily use BlockConnected events from the
active chain to update
m_best_block_index
as usual. No blocks from theactive chainstate will have been missed for indexing (despite not
previously updating m_best_block_index), so it is safe to perform this
transition without any extra behavior.
Some unnecessary logic from BaseIndex::ChainStateFlushed() is removed,
since the locator can no longer be assumed to be an ancestor of the
best_block_index (since the "best block" may be on a background chainstate
and the locator may have come from an assumed-valid tip well ahead of
it).