kernel: De-globalize validation caches #30141

TheCharlatan · 2024-05-19T14:17:42Z

The validation caches are currently set up independently from where the rest of the validation code is initialized. This makes their ownership semantics unclear. There is also no clear enforcement of when and in what order they need to be initialized. The caches are always initialized in the BasicTestingSetup although a number of tests don't actually need them.

Solve this by moving the caches from global scope into the ChainstateManager class. This simplifies the usage of the kernel library by no longer requiring manual setup of the caches prior to using the ChainstateManager. Tests that need to access the caches can instantiate them independently.
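The ownership change can be pictured with a minimal sketch. `ToyValidationCache` and `ToyChainstateManager` are hypothetical stand-ins, not the real Bitcoin Core classes, and the eviction rule is a placeholder. The only point is that each manager constructs and owns its cache, so there is no separate global setup step to forget or to run in the wrong order.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical stand-in for a validation cache (not the real class).
class ToyValidationCache
{
public:
    // The cache is fully usable once constructed; there is no Init() call.
    explicit ToyValidationCache(std::size_t max_entries) : m_max_entries{max_entries}
    {
        assert(max_entries > 0);
    }
    // Non-copyable, so ownership stays with whoever constructed it.
    ToyValidationCache(const ToyValidationCache&) = delete;
    ToyValidationCache& operator=(const ToyValidationCache&) = delete;

    bool Contains(const std::string& key) const { return m_entries.count(key) > 0; }

    void Insert(const std::string& key)
    {
        // Placeholder eviction: drop the smallest key when full.
        if (m_entries.size() >= m_max_entries) m_entries.erase(m_entries.begin());
        m_entries.insert(key);
    }

private:
    std::size_t m_max_entries;
    std::set<std::string> m_entries;
};

// Hypothetical stand-in for the manager that owns the cache.
class ToyChainstateManager
{
public:
    explicit ToyChainstateManager(std::size_t cache_entries) : m_validation_cache{cache_entries} {}

    ToyValidationCache m_validation_cache;
};
```

Two managers built this way do not share state, which is also what lets a test instantiate a cache on its own without touching anything global.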

This pull request is part of the libbitcoinkernel project.

DrahtBot · 2024-05-19T14:17:45Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	stickies-v, glozow, ryanofsky
Concept ACK	ajtowns
Stale ACK	maflcko, theuni

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#30157 (Fee Estimation via Fee rate Forecasters by ismaelsadeeq)
#29745 (bench: Adds a benchmark for CheckInputScripts by kevkevinpal)
#29641 (scripted-diff: Use LogInfo/LogDebug over LogPrintf/LogPrint by maflcko)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

ajtowns

Concept ACK

src/validation.h

theuni

Concept ACK. Nice.

src/script/sigcache.cpp

TheCharlatan · 2024-05-21T08:47:10Z

Updated 79c9c55 -> 4a1df97 (noGlobalScriptCache_0 -> noGlobalScriptCache_1, compare)

Addressed @theuni's comment, made CSignatureCache more RAII styled.
Addressed @ajtowns's comment, applied his suggestion of dropping the error handling if the cache size exceeds the uint32_t maximum value.
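As a rough illustration of clamping instead of erroring, the sketch below caps a requested entry count at the `uint32_t` maximum. `ClampToUint32` is a hypothetical helper name chosen for this example; it is not taken from the PR.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical helper: cap a 64-bit request at the largest uint32_t value
// instead of treating an oversized request as an error.
constexpr std::uint32_t ClampToUint32(std::uint64_t requested_entries)
{
    return static_cast<std::uint32_t>(
        std::min<std::uint64_t>(requested_entries, std::numeric_limits<std::uint32_t>::max()));
}

static_assert(ClampToUint32(1000) == 1000, "small values pass through unchanged");
static_assert(ClampToUint32(std::uint64_t{1} << 40) == std::numeric_limits<std::uint32_t>::max(),
              "oversized values are clamped");
```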

src/node/chainstatemanager_args.cpp

TheCharlatan · 2024-05-21T21:20:58Z

Updated 4a1df97 -> 5b1576b (noGlobalScriptCache_1 -> noGlobalScriptCache_2, compare)

Addressed @theuni's comment, fixed outdated comment.

TheCharlatan · 2024-06-18T08:40:29Z

Rebased 5b1576b -> 63923c8 (noGlobalScriptCache_2 -> noGlobalScriptCache_3, compare)

Fixed conflict with Encapsulate warnings in generalized node::Warnings and remove globals #30058

stickies-v

Approach ACK 63923c8

src/cuckoocache.h

src/validation.h

TheCharlatan · 2024-06-19T08:57:11Z

Thank you for the review @stickies-v,

63923c8 -> 6ad4aa8 (noGlobalScriptCache_3 -> noGlobalScriptCache_4, compare)

Addressed @stickies-v's comment, deleting copy and move constructor of the newly introduced ValidationCache class.
Addressed @stickies-v's comment, applying the suggestion of making the pre-initialized hasher private and adding a getter for it.
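A minimal sketch of the two suggestions, under stated assumptions: `ToyHasher` is a toy FNV-1a hasher standing in for the real hasher, and `SeededCache` is a hypothetical class name. The getter name mirrors `ScriptExecutionCacheHasher()` from the diff quoted in this thread; returning by value there is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>
#include <type_traits>

// Toy FNV-1a hasher; a stand-in so the sketch is self-contained.
class ToyHasher
{
public:
    ToyHasher& Write(std::string_view data)
    {
        for (const unsigned char byte : data) {
            m_state ^= byte;
            m_state *= 1099511628211ULL; // FNV-1a 64-bit prime
        }
        return *this;
    }
    std::uint64_t Finalize() const { return m_state; }

private:
    std::uint64_t m_state{14695981039346656037ULL}; // FNV-1a 64-bit offset basis
};

// Hypothetical cache that seeds its hasher once, at construction.
class SeededCache
{
public:
    explicit SeededCache(std::string_view nonce)
    {
        assert(!nonce.empty());
        m_hasher.Write(nonce);
    }
    // Copying or moving would duplicate or relocate the owned state.
    SeededCache(const SeededCache&) = delete;
    SeededCache(SeededCache&&) = delete;
    SeededCache& operator=(const SeededCache&) = delete;
    SeededCache& operator=(SeededCache&&) = delete;

    // Returns a copy, so callers can keep writing without changing the seed.
    ToyHasher ScriptExecutionCacheHasher() const { return m_hasher; }

private:
    ToyHasher m_hasher; // private: only reachable through the getter
};

static_assert(!std::is_copy_constructible_v<SeededCache>);
static_assert(!std::is_move_constructible_v<SeededCache>);
```

Because the getter hands out a copy, each caller starts from the same seeded state and the stored hasher is never modified after construction.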

src/test/txvalidationcache_tests.cpp

ryanofsky

Code review ACK c1d6e52. Nice that this PR is +202 −261 lines and cleans a number of things up in addition to removing the globals.

I noticed #10754 while reviewing this, which suggests the idea of using a shared cache and could be something to follow up on. It also provides more background information about the caches.

re: #30141 (comment)

> This was a regression I introduced after the first round of review

That's interesting, I guess it shows the dangers of reviewing with range-diff, because it's hard to know if a review suggestion was fully implemented or there may be some old code left behind.

maflcko · 2024-07-03T09:27:11Z

> That's interesting, I guess it shows the dangers of reviewing with range-diff, because it's hard to know if a review suggestion was fully implemented or there may be some old code left behind.

I don't think this is related to range-diff. It is simply a typo, which can be missed by a human reviewer regardless of how the code or diff is displayed. (I didn't use range-diff and missed it, fwiw)

ryanofsky · 2024-07-03T10:39:26Z

> I don't think this is related to range-diff. It is simply a typo, which can be missed by a human reviewer regardless of how the code or diff is displayed. (I didn't use range-diff and missed it, fwiw)

I'll often ack a PR and make a suggestion like you can "replace foo/2 with bar". Then if the PR is updated, I will check the range diff and re-ack with "the only thing that changed since last review is replacing foo/2 with bar" without looking at the complete diff again. This is potentially dangerous because it doesn't verify the suggestion was fully implemented, which depending on the suggestion might introduce bugs. In this case there was a pretty obvious bug that early reviewers who looked at multiple versions of the PR missed, and a later reviewer who was reviewing the PR for the first time pointed out. So relying too much on range-diff might have caused this, and it's a good reminder (to myself at least) to be careful when using it.

glozow

ACK c1d6e52

glozow · 2024-07-03T11:24:46Z

src/validation.cpp

@@ -2145,10 +2147,10 @@ bool CheckInputScripts(const CTransaction& tx, TxValidationState& state,
    // properly commits to the scriptPubKey in the inputs view of that
    // transaction).
    uint256 hashCacheEntry;
-    CSHA256 hasher = g_scriptExecutionCacheHasher;
+    CSHA256 hasher = validation_cache.ScriptExecutionCacheHasher();
    hasher.Write(UCharCast(tx.GetWitnessHash().begin()), 32).Write((unsigned char*)&flags, sizeof(flags)).Finalize(hashCacheEntry.begin());
    AssertLockHeld(cs_main); //TODO: Remove this requirement by making CuckooCache not require external locks


Question tangential to this PR: This seems to mean the script execution cache needs something to synchronize access, and it is currently implicitly using cs_main? I don't see any threads other than message handler that access it though.

I guess the script execution cache could be wrapped just like the signature cache, which has an internal mutex.
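One way to picture the wrapping idea is a cache that carries its own lock, so callers no longer depend on an external one. This is only a sketch: `LockedCache` is a hypothetical name, it uses a plain `std::mutex` for simplicity, and it is not a description of how the signature cache is actually implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <unordered_set>

// Hypothetical wrapper: every access takes the internal mutex, so callers
// do not need to hold any external lock.
class LockedCache
{
public:
    bool Contains(std::uint64_t key) const
    {
        std::lock_guard<std::mutex> lock{m_mutex};
        return m_entries.count(key) > 0;
    }

    void Insert(std::uint64_t key)
    {
        std::lock_guard<std::mutex> lock{m_mutex};
        m_entries.insert(key);
    }

private:
    mutable std::mutex m_mutex; // mutable so const readers can lock it
    std::unordered_set<std::uint64_t> m_entries;
};

// Example use: no lock is taken at the call site.
inline void ExampleUse()
{
    LockedCache cache;
    cache.Insert(42);
    assert(cache.Contains(42));
}
```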

src/script/sigcache.h

ryanofsky · 2024-07-03T13:24:46Z

I wonder if there is a DrahtBot parsing bug? It seems to be parsing #30141 (review) as a concept ack instead of a plain ack, and it requested another review from me despite the ack.

EDIT: This happened because DrahtBot was not parsing the linked comment. I had written another comment with the word A-C-K after my review, and DrahtBot seems to look at the latest comment with the word A-C-K, ignoring earlier comments, as described in maflcko/DrahtBot#33

maflcko · 2024-07-03T13:59:03Z

DrahtBot is open-source, so pull requests and bug reports are welcome. But I am not sure if the additional code is worth it for the rare case where someone writes a comment containing A CK after they sent a proper review A CK commit_hash. If you care about the summary comment being correct, you can:

Edit your comments after your review to not contain A CK, or
Resubmit your review

Change LogInstance() function to no longer allocate (and leak) a BCLog::Logger instance. Instead allow kernel applications to initialize their own logging instances that can be returned by LogInstance(). This change is somewhat useful by itself, but more useful in combination with bitcoin#30342. By itself, it gives kernel applications control over when the Logger is created and destroyed and allows new instances to replace old ones. In combination with bitcoin#30342 it allows multiple log instances to exist at the same time, and different output to be sent to different instances. This commit is built on top of bitcoin#30141 since it simplifies the implementation somewhat.

stickies-v

ACK c1d6e52. Left a few nits but would suggest not to address unless force push is necessary, although of course I'll quickly re-review if you do.

> I guess it shows the dangers of reviewing with range-diff

FYI In my case, too, this wasn't a review failure because of range-diff, as I've done multiple full re-reviews that included the affected line. I think I was blind to it because of the similarity to the -maxsigcache halving that is expected.

Thanks a lot for your keen observation and catching this mistake @ryanofsky.

src/init.cpp

src/test/txvalidationcache_tests.cpp

stickies-v · 2024-07-04T16:01:38Z

src/validation.cpp

    g_scriptExecutionCacheHasher.Write(nonce.begin(), 32);
    g_scriptExecutionCacheHasher.Write(nonce.begin(), 32);


in 06e87c2: modifying this global inside a constructor seems quite footgunny. The footgun is removed in the next commit 0d41630, but I think the robust thing to do would be to squash the next commit to avoid cherry-pick accidents. I don't practically see this leading to issues, so I'm fine keeping it as is too to minimize the range-diff, so it's probably more of a review note.

Mmh, the footgun here seems less bad than calling InitScriptExecutionCache twice, no?

Perhaps slightly so, because here at least setup_bytes() isn't called again.

My main point was that with InitScriptExecutionCache it's clearly an initialization function that's affecting global state. On the other hand, one shouldn't have to expect that simply constructing an object is going to invalidate all other objects of the same type, that's unintuitive and rather opaque I think.

But we can leave this as is, it's resolved in the next commit after all.

src/script/sigcache.cpp

Instead clamp it to uint32::max if it exceeds it. Co-authored-by: Anthony Towns <aj@erisian.com.au>

Move its ownership to the ChainstateManager class. Next to simplifying usage of the kernel library by no longer requiring manual setup of the cache prior to using validation code, it also slims down the amount of memory allocated by BasicTestingSetup.

Move it to the ChainstateManager class.

This is done in preparation for the following commit. Also rename it to SignatureCache.

Move its ownership to the ChainstateManager class. Next to simplifying usage of the kernel library by no longer requiring manual setup of the cache prior to using validation code, it also slims down the amount of memory allocated by BasicTestingSetup. Use this opportunity to make SignatureCache RAII styled. Co-authored-by: Ryan Ofsky <ryan@ofsky.org>

TheCharlatan · 2024-07-05T07:10:36Z

Updated c1d6e52 -> 606a7ab (noGlobalScriptCache_13 -> noGlobalScriptCache_14, compare)

Addressed @glozow's comment, added a static_assert to check that the two default cache sizes add up correctly.
Addressed @stickies-v's comment_1 and comment_2, removing unnecessary whitespace and formatting change.
Addressed @stickies-v's comment, removed unneeded include.
Addressed @stickies-v's comment, aligned ampersand operator to type name.
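The `static_assert` mentioned above could look roughly like the following. `DEFAULT_VALIDATION_CACHE_BYTES` and `DEFAULT_SCRIPT_EXECUTION_CACHE_BYTES` are names that appear in this thread; `DEFAULT_SIGNATURE_CACHE_BYTES` and the even split between the two caches are assumptions made for this sketch.

```cpp
#include <cstddef>

// Total budget, as in the diff quoted in this thread.
static constexpr std::size_t DEFAULT_VALIDATION_CACHE_BYTES{32 << 20};
// Assumed for illustration: the budget is split evenly between the two caches.
static constexpr std::size_t DEFAULT_SIGNATURE_CACHE_BYTES{DEFAULT_VALIDATION_CACHE_BYTES / 2};
static constexpr std::size_t DEFAULT_SCRIPT_EXECUTION_CACHE_BYTES{DEFAULT_VALIDATION_CACHE_BYTES / 2};

// Fails to compile if someone changes one constant without the others.
static_assert(DEFAULT_VALIDATION_CACHE_BYTES ==
                  DEFAULT_SIGNATURE_CACHE_BYTES + DEFAULT_SCRIPT_EXECUTION_CACHE_BYTES,
              "the two default cache sizes must add up to the total");
```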

dergoegge · 2024-07-05T11:17:01Z

src/test/fuzz/script_sigcache.cpp

@@ -36,7 +39,7 @@ FUZZ_TARGET(script_sigcache, .init = initialize_script_sigcache)
    const CAmount amount = ConsumeMoney(fuzzed_data_provider);
    const bool store = fuzzed_data_provider.ConsumeBool();
    PrecomputedTransactionData tx_data;
-    CachingTransactionSignatureChecker caching_transaction_signature_checker{mutable_transaction ? &tx : nullptr, n_in, amount, store, tx_data};
+    CachingTransactionSignatureChecker caching_transaction_signature_checker{mutable_transaction ? &tx : nullptr, n_in, amount, store, *g_signature_cache, tx_data};


Creating the sig cache every iteration (instead of the global) would be better

Should we ideally check both?

You mean test a global and local cache? We shouldn't fuzz globals (unless we can reset their state).

The reason I commented is that globals make fuzz tests non-deterministic. Let's say the harness is called (in-process) with input A, B and then C. It crashes on input C (which will be the input that the fuzz engine reports to you) but it only crashed because of the data stored in the global cache from input A and B. Giving the harness just input C won't make it crash, because it depends on the state from the previous iterations.

The way fuzz engines gather coverage information is also thrown off by non-determinism like this. They are designed under the assumption that the body of the fuzz harness is a pure function.
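The determinism argument can be made concrete with a toy harness. Everything here is hypothetical (`CheckWithCache`, `HarnessGlobal`, `HarnessLocal`, `g_cache`) and unrelated to the real fuzz target; it only shows how a global cache makes the result for one input depend on earlier inputs, while a per-iteration cache does not.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy "validation": reports whether the input was already cached, then caches it.
inline bool CheckWithCache(std::set<std::string>& cache, const std::string& input)
{
    const bool was_cached = cache.count(input) > 0;
    cache.insert(input);
    return was_cached;
}

std::set<std::string> g_cache; // global: survives across harness iterations

// Result depends on every input seen so far, not just this one.
inline bool HarnessGlobal(const std::string& input) { return CheckWithCache(g_cache, input); }

// Fresh cache per iteration: the harness body is a pure function of its input.
inline bool HarnessLocal(const std::string& input)
{
    std::set<std::string> cache;
    return CheckWithCache(cache, input);
}

inline void ReplayExample()
{
    g_cache.clear();
    assert(!HarnessGlobal("A")); // first run of input A
    assert(HarnessGlobal("A"));  // same input, different result
    assert(!HarnessLocal("A"));  // local cache: same result...
    assert(!HarnessLocal("A"));  // ...every time
}
```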

stickies-v

re-ACK 606a7ab

stickies-v · 2024-07-05T17:29:20Z

src/script/sigcache.h

 // DoS prevention: limit cache size to 32MiB (over 1000000 entries on 64-bit
 // systems). Due to how we count cache size, actual memory usage is slightly
 // more (~32.25 MiB)
-static constexpr size_t DEFAULT_MAX_SIG_CACHE_BYTES{32 << 20};
+static constexpr size_t DEFAULT_VALIDATION_CACHE_BYTES{32 << 20};


nit: having DEFAULT_VALIDATION_CACHE_BYTES and DEFAULT_SCRIPT_EXECUTION_CACHE_BYTES defined in sigcache.h is not ideal. One alternative is to rename script/sigcache.h to script/cache.h and move ValidationCache into it, but that touches quite a few lines so I'm not convinced that's the best thing to do for this PR to make progress.

glozow

reACK 606a7ab

ryanofsky

Code review ACK 606a7ab. Just small formatting, include, and static_assert changes since last review.

I think it would be great to follow up on dergoegge's comment about fuzzing #30141 (comment). It seems like it could make fuzzing output much more useful. I don't think it is critical to do it as part of this PR though, since the fuzz test currently relies on global state and this PR isn't changing that.

hebasto · 2024-07-14T06:53:33Z

Ported to the CMake-based build system in hebasto#264.

DrahtBot added the Validation label May 19, 2024

TheCharlatan marked this pull request as ready for review May 19, 2024 17:36

DrahtBot mentioned this pull request May 20, 2024

scripted-diff: Use LogInfo over LogPrintf [WIP, NOMERGE, DRAFT] #29641

Draft

ajtowns reviewed May 20, 2024

src/validation.h

theuni reviewed May 20, 2024

src/script/sigcache.cpp

TheCharlatan force-pushed the noGlobalScriptCache branch from 79c9c55 to 4a1df97 Compare May 21, 2024 08:36

theuni reviewed May 21, 2024

src/node/chainstatemanager_args.cpp

TheCharlatan force-pushed the noGlobalScriptCache branch from 4a1df97 to 5b1576b Compare May 21, 2024 21:20

DrahtBot added CI failed and removed CI failed labels May 22, 2024

This was referenced May 23, 2024

Fee Estimation via Fee rate Forecasters #30157

Draft

Encapsulate warnings in generalized node::Warnings and remove globals #30058

Merged

bench: Adds a benchmark for CheckInputScripts #29745

Closed

This was referenced May 30, 2024

Stratum v2 Template Provider (take 3) #29432

Closed

refactor: Improve assumeutxo state representation #30214

Open

DrahtBot mentioned this pull request Jun 6, 2024

Cluster mempool implementation #28676

Open

8 tasks

DrahtBot added the Needs rebase label Jun 17, 2024

TheCharlatan force-pushed the noGlobalScriptCache branch from 5b1576b to 63923c8 Compare June 18, 2024 08:40

DrahtBot removed the Needs rebase label Jun 18, 2024

stickies-v reviewed Jun 18, 2024

src/cuckoocache.h

src/validation.h

src/validation.h

DrahtBot mentioned this pull request Jun 19, 2024

kernel, refactor: return error status on all fatal errors #29700

Draft

TheCharlatan force-pushed the noGlobalScriptCache branch from 63923c8 to 6ad4aa8 Compare June 19, 2024 08:57

maflcko reviewed Jun 19, 2024

src/test/txvalidationcache_tests.cpp

TheCharlatan force-pushed the noGlobalScriptCache branch from 6ad4aa8 to fa660a2 Compare June 19, 2024 15:15

ryanofsky approved these changes Jul 2, 2024

DrahtBot requested a review from theuni July 2, 2024 21:26

DrahtBot mentioned this pull request Jul 2, 2024

build: Introduce internal kernel library #28690

Open

glozow reviewed Jul 3, 2024

DrahtBot requested a review from ryanofsky July 3, 2024 11:40

maflcko mentioned this pull request Jul 3, 2024

Concept (N)ACK issues maflcko/DrahtBot#33

Open

stickies-v approved these changes Jul 4, 2024

TheCharlatan and others added 5 commits July 4, 2024 22:35

validation: Don't error if maxsigcachesize exceeds uint32::max

ab14d1d

Instead clamp it to uint32::max if it exceeds it. Co-authored-by: Anthony Towns <aj@erisian.com.au>

kernel: De-globalize script execution cache hasher

021d388

Move it to the ChainstateManager class.

Expose CSignatureCache class in header

66d74bf

This is done in preparation for the following commit. Also rename it to SignatureCache.

TheCharlatan force-pushed the noGlobalScriptCache branch from c1d6e52 to 606a7ab Compare July 5, 2024 07:10

dergoegge reviewed Jul 5, 2024

stickies-v approved these changes Jul 5, 2024

DrahtBot requested a review from glozow July 5, 2024 17:52

glozow reviewed Jul 8, 2024

ryanofsky approved these changes Jul 8, 2024

ryanofsky merged commit 94d56b9 into bitcoin:master Jul 8, 2024

hebasto added the Needs CMake port label Jul 13, 2024

hebasto mentioned this pull request Jul 14, 2024

cmake: Regular rebasing of the cmake-staging branch hebasto/bitcoin#264

Closed

hebasto removed the Needs CMake port label Jul 14, 2024

bitcoin locked and limited conversation to collaborators Jul 14, 2025
