-
Notifications
You must be signed in to change notification settings - Fork 37.7k
kernel: Separate UTXO set access from validation functions #32317
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32317. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
🚧 At least one of the CI tasks failed. HintsTry to run the tests locally, according to the documentation. However, a CI failure may still
Leave a comment here, if you need help tracking down a confusing failure. |
7492904
to
c1b36c4
Compare
c1b36c4
to
76a8f22
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK, as this PR increases the decoupling of components.
nit: The name SpendBlock
may not be clear (unless this concept is already known and I'm missing the point).
Maybe BIP30Validation
, something like that, would be clearer.
It is doing more than BIP-30 validation, so I don't think that would accurate either. I thought the name was pretty clear for what it does, but I'd be open to other suggestions. |
bool CheckInputScripts(const CTransaction& tx, TxValidationState& state, | ||
const CCoinsViewCache& inputs, unsigned int flags, bool cacheSigStore, | ||
std::vector<CTxOut>&& spent_outputs, unsigned int flags, bool cacheSigStore, | ||
bool cacheFullScriptStore, PrecomputedTransactionData& txdata, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we keep passing the spent_outputs
instead of removing it (to simplify the API) and requiring callers to initialize txdata
before calling CheckInputScripts
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My read is that the call to txdata.Init
is delayed to after checking the script execution cache for performance reasons. Calling Init
does some hashing, that would not be required if it is already in the cache. Moving the spent_outputs
vector construction is already a potential performance downside for potential cache hits, though I think that is more acceptable in the grand scheme of things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, it makes sense. Thanks for the insights.
Updated 76a8f22 -> 16a695f (spendblock_0 -> spendblock_1, compare)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK
Reducing the coupling between validation functions and the UTXO set makes the validation functions more usable, and more testable. The approach taken here seems sensible, but I'm going to think about alternatives a bit more.
for (const auto& txin : tx.vin) { | ||
const COutPoint& prevout = txin.prevout; | ||
const Coin& coin = AccessCoin(prevout); | ||
assert(!coin.IsSpent()); | ||
spent_outputs.emplace_back(coin.out); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can use AccessCoins
here?
for (const auto& txin : tx.vin) { | |
const COutPoint& prevout = txin.prevout; | |
const Coin& coin = AccessCoin(prevout); | |
assert(!coin.IsSpent()); | |
spent_outputs.emplace_back(coin.out); | |
} | |
for (const Coin& coin : AccessCoins(tx)) { | |
assert(!coin.IsSpent()); | |
spent_outputs.emplace_back(coin.out); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: inconsistent naming between GetUnspentOutputs
and spent_outputs
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also use AccessCoin
, but I think accessing it directly is a bit more efficient, because we don't allocate another vector. Do you have a better suggestion for the naming there? It is all a bit confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hadn't considered the extra vector allocation, yeah I'm okay leaving this as-is then.
I did nerd-snipe myself into looking into having AccessCoins
return a view would be quite elegant in some ways, allowing lazy evaluation, but seems like it would require quite a bit of overhaul and might have other drawbacks I've not considered yet.
auto AccessCoins(const CTransaction& tx) const
{
return tx.vin | std::views::transform([this](const CTxIn& input) -> const Coin& {
return this->AccessCoin(input.prevout);
});
}
std::vector<CTxOut> spent_outputs{active_coins_tip.GetUnspentOutputs(tx)}; | ||
txdata.Init(tx, std::move(spent_outputs)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a way, it's a bit silly to add a parameter to the CheckInputScripts
signature to then just move it into another parameter (txdata
). An alternative approach could be to let the caller handle this, and assert txdata.m_spent_outputs_ready
, but that could raise runtime assertion errors instead of catching them at compile time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @TheCharlatan noted here, delaying txdata.Init
could be on purpose for the potential cache hit here. I wonder if there is a cleaner version for achieving the same goal though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK.
I had a few more comments (sorry for the scattered feedback; didn't have the chance to review everything earlier).
PSA: Bitcoin Core PR Review Club will cover this PR in its next meeting at 2025-05-14 at 17:00 UTC. See https://bitcoincore.reviews/32317 for notes, questions, and instructions on how to join. |
Thank you for the reviews @stringintech and @stickies-v! 16a695f -> e8ac6d2 (spendblock_1 -> spendblock_2, compare)
|
Rebased 6e90a0e -> 1f96d1c (spendblock_7 -> spendblock_8, compare)
|
🚧 At least one of the CI tasks failed. HintsTry to run the tests locally, according to the documentation. However, a CI failure may still
Leave a comment here, if you need help tracking down a confusing failure. |
Updated 1f96d1c -> 10c3a06 (spendblock_8 -> spendblock_9, compare)
|
10c3a06
to
3843a15
Compare
Rebased 10c3a06 -> 3843a15 (spendblock_9 -> spendblock_10, compare)
|
Move the responsibility of retrieving coins from GetP2SHSigOpCount to its caller. This is a part of a series of commits for removing access to the CCoinsViewCache in consensus verification functions. The goal is to allow calling verification functions with pre-fetched, or a user-defined set of coins. Define two explicit template specializations for both a span of references to coins and a span of coins. This allows using it for both Coin entries referenced from the CCoinsViewCache, and from contiguous memory, like the vector in CBlockUndo.
Move the responsibility of retrieving coins from GetTransactionSigOpCost to its caller. This is a part of a series of commits for removing access to the CCoinsViewCache in consensus verification functions. The goal is to allow calling verification functions with pre-fetched, or a user-defined set of coins. Define two explicit template specializations for both a span of references to coins and a span of coins. This allows using it for both Coin entries referenced from the CCoinsViewCache, and from contiguous memory, like the vector in CBlockUndo.
Move the responsibility of retrieving coins from CheckTxInputs to its caller. The missing inputs check will be moved in an upcoming commit. This is a part of a series of commits for removing access to the CCoinsViewCache in consensus verification functions. The goal is to allow calling verification functions with pre-fetched, or a user-defined set of coins. Define two explicit template specializations for both a span of references to coins and a span of coins. This allows using it for both Coin entries referenced from the CCoinsViewCache, and from contiguous memory, like the vector in CBlockUndo.
…InputScripts Move the responsibility of retrieving coins from the CCoinsViewCache in CheckInputScripts to its caller. Add a helper method in CCoinsViewCache to collect all Coin's outputs spent by a transaction's inputs. Callers of CCoinsViewCache are updated to either pre-fetch the spent outputs, or pass in an empty vector if the precomputed transaction data has already been initialized with the required outputs. This is a part of a series of commits for removing access to the CCoinsViewCache in consensus verification functions. The goal is to allow calling verification functions with pre-fetched, or a user-defined set of coins.
Move the BIP30 checks from ConnectBlock to a new SpendBlock method. This is a move-only change, more content to SpendBlock is added in the next commits. The goal is to move logic requiring access to the CCoinsViewCache out of ConnectBlock and to the new SpendBlock method. SpendBlock will in future handle all UTXO set interactions that previously took place in ConnectBlock. Callers of ConnectBlock now also need to call SpendBlock before. This is enforced in a future commit by adding a CBlockUndo argument that needs to be passed to both. This is a part of a series of commits for removing access to the CCoinsViewCache in consensus verification functions. The goal is to allow calling verification functions with pre-fetched, or a user-defined set of coins.
This is a part of a series of commits for removing access to the CCoinsViewCache in consensus verification functions. The goal is to allow calling verification functions with pre-fetched, or a user-defined set of coins.
Move the remaining UTXO-related operations from ConnectBlock to SpendBlock. This includes moving the existence check, the UpdateCoins call, and CBlockUndo generation. ConnectBlock now takes a pre-populated CBlockUndo as an argument and no longer accesses or manipulates the UTXO set.
3843a15
to
5537ac0
Compare
Rebased 3843a15 -> 5537ac0 (spendblock_10 -> spendblock_11, compare)
|
The current code makes it hard to call functions for checking the consensus-correctness of a block without having a full UTXO set at hand. This makes implementing features such as Utreexo and non-assumevalid swiftsync, where maintaining a full UTXO set defeats their purpose, through the kernel API difficult. Checks like the coinbase subsidy, or block sigops are only available through
ConnectBlock
, which requires a complete coins cache.Solve this by moving accessing coins and the responsibility of spending them out of
ConnectBlock
and into a new separate function calledSpendBlock
. Pass a block's spent coins through the existingCBlockUndo
data structure toConnectBlock
.While the change is largely a refactor, some smaller behaviour changes were unavoidable. These should be highlighted in the commit messages. The last commit contains the most important part of the coins logic move, the commits before attempt to prepare it piecemeal.
This pull request was benchmarked on benchcoin, which showed no performance regression. While discussing this pull request it was noted that pre-fetching and eliminating the need for some additional map lookups might improve performance a bit, but this does not make a measurable difference in the grand scheme of things and was not the motivation for this pull request.
In future changes, ConnectBlock could be made even more self contained, accessing the coins from the internal map could be further optimized by consolidating existence checking and spending, and the function eventually exposed in an external kernel API.