Skip to content

Conversation

fjahr
Copy link
Contributor

@fjahr fjahr commented May 20, 2025

This topic has been motivated by my work on batch validation and a related conversation just happened here: #32467 (comment)

CheckInputScripts currently only does what its name implies if there is no multithreading usage for validation. If there are worker threads available the function instead only creates the checks and puts them into a vector. Aside from some shared pre-checks the multithreaded code path is much simpler compared to the rest. This dual use makes the function hard to grasp and its naming confusing. So this PR is splitting the single thread case from the multithread. This results in more LOC but the result is much clearer.

@DrahtBot
Copy link
Contributor

DrahtBot commented May 20, 2025

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32575.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK theuni, TheCharlatan, stickies-v

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #32998 (Bump SCRIPT_VERIFY flags to 64 bit by ajtowns)
  • #32317 (kernel: Separate UTXO set access from validation functions by TheCharlatan)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@theuni
Copy link
Member

theuni commented May 20, 2025

Concept ACK. Anything to clean this up :)

See also #32317 which moves some of this functionality around similarly.

@fjahr fjahr force-pushed the 2025-05-CheckInputScript-split branch from 4f355d4 to 4262b75 Compare May 20, 2025 21:43
@fjahr fjahr marked this pull request as ready for review May 20, 2025 21:48
@fjahr
Copy link
Contributor Author

fjahr commented May 20, 2025

Cleaned up the comments so this should be ready for more detailed feedback, but expect a rebase once #32467 is merged.

bool cacheFullScriptStore, PrecomputedTransactionData& txdata,
ValidationCache& validation_cache) EXCLUSIVE_LOCKS_REQUIRED(cs_main)
{
if (tx.IsCoinBase()) return {};
Copy link
Contributor Author

@fjahr fjahr May 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have been contemplating kicking this out (here and in CheckInputScripts) or even turning this into an assert because the callers already handle this everywhere. On the other hand it conceptually makes sense to return early if we know there is nothing to check because we don't expect any inputs. So I am still undecided on this one and happy to hear opinions.

@TheCharlatan
Copy link
Contributor

Concept ACK

@fjahr fjahr changed the title RFC: refactor: Split multithreaded case out of CheckInputScripts refactor: Split multithreaded case out of CheckInputScripts May 20, 2025
@stickies-v
Copy link
Contributor

Concept ACK

@theuni
Copy link
Member

theuni commented May 22, 2025

Ready for rebase.

@fjahr fjahr force-pushed the 2025-05-CheckInputScript-split branch from 4262b75 to 5dfd10c Compare May 22, 2025 20:07
@fjahr
Copy link
Contributor Author

fjahr commented May 22, 2025

Ready for rebase.

Done, ready for review :)

state.Invalid(BlockValidationResult::BLOCK_CONSENSUS,
tx_state.GetRejectReason(), tx_state.GetDebugMessage());
break;
if (!CheckInputScripts(tx, tx_state, view, flags, fCacheResults, fCacheResults, txsdata[i], m_chainman.m_validation_cache)) {
Copy link
Member

@theuni theuni May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should go a little further. At this point, CheckInputScripts is essentially meant to be a mempool function and PrepareInputScriptChecks is for block validation, except for the weird edge-case when we're single-threaded.

I think I'd rather just see this go all the way and make the split purposeful. Something like:

diff --git a/src/validation.cpp b/src/validation.cpp
index 1c81620889f..d9a447b460f 100644
--- a/src/validation.cpp
+++ b/src/validation.cpp
@@ -2723,19 +2723,18 @@ bool Chainstate::ConnectBlock(const CBlock& block, BlockValidationState& state,
         {
             bool fCacheResults = fJustCheck; /* Don't cache results if we're actually connecting blocks (still consult the cache, though) */
             TxValidationState tx_state;
-            // If parallel script checking is possible (worker threads are available) the checks are appended to a
-            // vector without running them. The vector is then added to control which runs the checks asynchronously.
-            // Otherwise, CheckInputScripts runs the checks on a single thread before returning.
-            if (control) {
-                std::vector<CScriptCheck> vChecks = PrepareInputScriptChecks(tx, view, flags, fCacheResults, fCacheResults, txsdata[i], m_chainman.m_validation_cache);
-                control->Add(std::move(vChecks));
-            } else {
-                if (!CheckInputScripts(tx, tx_state, view, flags, fCacheResults, fCacheResults, txsdata[i], m_chainman.m_validation_cache)) {
-                    // Any transaction validation failure in ConnectBlock is a block consensus failure
-                    state.Invalid(BlockValidationResult::BLOCK_CONSENSUS,
-                                  tx_state.GetRejectReason(), tx_state.GetDebugMessage());
-                    break;
+            // If parallel script checking is possible (worker threads are available), they are added to control which
+            // which runs the checks asynchronously. Otherwise they are run here directly.
+            std::vector<CScriptCheck> vChecks = PrepareInputScriptChecks(tx, view, flags, fCacheResults, fCacheResults, txsdata[i], m_chainman.m_validation_cache);
+            if (control) control->Add(std::move(vChecks));
+            else {
+                for (auto& check : vChecks) {
+                    if (const auto& check_result = check()) {
+                        state.Invalid(BlockValidationResult::BLOCK_CONSENSUS, strprintf("mandatory-script-verify-flag-failed (%s)", ScriptErrorString(check_result->first)), check_result->second);
+                        break;
+                    }
                 }
+                if (!state.IsValid()) break;
             }
         }

An alternative would be to make CCheckQueueControl work with 0 worker threads (maybe it already does?) and just always use it. That would completely unify the approach here.

Copy link
Contributor

@mzumsande mzumsande May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that idea.

I think it would be a small change in behaviour, because if I read this line right it will add entries to the cache if
1)pvChecks is not set (single-threaded)
2) cacheFullScriptStore is set (i.e. if ConnectBlock() was called with fJustCheck=True, so currently only from TestBlockValidity).
I wonder if the current behavior is intended - it seems very strange to me that mining rpcs would add entries to the script cache, but only in the -par=1 case.

Copy link
Member

@theuni theuni Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mzumsande I think it's very unlikely that that behavior is intended :)

Edit: likely -> unlikely.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whooops, *Very unlikely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took the suggestion here.


EnsureTxData(tx, inputs, txdata);

std::vector<CScriptCheck> pvChecks{};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a rough idea, but did you consider the alternative approach of doing the pvChecks population not here (but in ConnectBlock() or a separate function), so that CheckInputScripts could also call PrepareInputScriptChecks and there would be no duplication of the coinbase check, cache check and TxData init?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have implemented this at the cost of duplicating the cache entry calculation. I think it should be fine but I will try to run some benchmarks.

@fjahr fjahr force-pushed the 2025-05-CheckInputScript-split branch from 5dfd10c to edcc004 Compare June 19, 2025 22:00
@fjahr
Copy link
Contributor Author

fjahr commented Jun 19, 2025

An alternative would be to make CCheckQueueControl work with 0 worker threads (maybe it already does?) and just always use it. That would completely unify the approach here.

Yeah, that does work basically out of the box. I tested it with this minimal change on top of the changes here: fjahr@320b9a9 However I don't favor that approach at the moment since there seems to be a lot of overhead when using checkqueue with a single thread and a lot of docs would need to be changed accordingly as well. Just logically you wouldn't think about using a queue when you just have a single thread, so while it would definitely simplify the code in ConnectBlock, I am not sure it's a bigger win than the other approaches discussed here, given that it may cause irritation in other places and a performance penalty. I am happy to be convinced of this approach though if I overlooked something.

I have now pushed here an updated version that addresses both the suggestions from @theuni and @mzumsande .

@fjahr fjahr force-pushed the 2025-05-CheckInputScript-split branch from 6667ea1 to 4fd1775 Compare August 13, 2025 13:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants