Improve peformance of CTransaction::HasWitness (28107 follow-up) #28766

dergoegge · 2023-11-01T11:03:18Z

This addresses #28107 (comment) from #28107.

DrahtBot · 2023-11-01T11:03:21Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	theStack, stickies-v, TheCharlatan, achow101
Concept ACK	glozow

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

dergoegge · 2023-11-01T11:56:20Z

Can someone please re-run the "CI / macOS 13 native" job, failure seems unrelated.

fanquake · 2023-11-01T11:58:30Z

failure seems unrelated.

Are you sure?

 Traceback (most recent call last):
  File "/Users/runner/work/bitcoin/bitcoin/ci/scratch/build/bitcoin-x86_64-apple-darwin/test/functional/test_framework/test_framework.py", line 132, in main
    self.run_test()
  File "/Users/runner/work/bitcoin/bitcoin/ci/scratch/build/bitcoin-x86_64-apple-darwin/test/functional/rpc_createmultisig.py", line 65, in run_test
    self.do_multisig()
  File "/Users/runner/work/bitcoin/bitcoin/ci/scratch/build/bitcoin-x86_64-apple-darwin/test/functional/rpc_createmultisig.py", line 241, in do_multisig
    tx = node0.sendrawtransaction(rawtx3["hex"], 0)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/runner/work/bitcoin/bitcoin/ci/scratch/build/bitcoin-x86_64-apple-darwin/test/functional/test_framework/coverage.py", line 50, in __call__
    return_val = self.auth_service_proxy_instance.__call__(*args, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/runner/work/bitcoin/bitcoin/ci/scratch/build/bitcoin-x86_64-apple-darwin/test/functional/test_framework/authproxy.py", line 129, in __call__
    raise JSONRPCException(response['error'], status)
test_framework.authproxy.JSONRPCException: mandatory-script-verify-flag-failed (Witness program was passed an empty witness) (-26)

dergoegge · 2023-11-01T12:01:35Z

Are you sure?

Yes, it passes locally.

fanquake · 2023-11-01T12:04:33Z

Want to document it as an intermittent failure?

stickies-v

Approach ACK

I had similar thoughts while reviewing 28107 but figured the computation was still fast enough. Given that ComputeWitnessHash is only called once during initialization (and is private) and HasWitness is called frequently, this approach is strictly better.

src/primitives/transaction.cpp

fanquake · 2023-11-01T13:20:59Z

The rpc_createmultisig.py test is now failing in two other CI jobs and MSAN is unhappy:

==19201==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x563f069e85d0 in bcmp /msan/llvm-project/compiler-rt/lib/msan/../sanitizer_common/sanitizer_common_interceptors.inc:856:10
    #1 0x563f0956bff5 in base_blob<256u>::Compare(base_blob<256u> const&) const src/./uint256.h:54:66
    #2 0x563f0956bff5 in operator!=(base_blob<256u> const&, base_blob<256u> const&) src/./uint256.h:57:89
    #3 0x563f0956bff5 in CTransaction::HasWitness() const src/./primitives/transaction.h:370:33
    #4 0x563f0956bff5 in void SerializeTransaction<CHashWriter, CTransaction>(CTransaction const&, CHashWriter&) src/./primitives/transaction.h:265:16
    #5 0x563f09569863 in void CTransaction::Serialize<CHashWriter>(CHashWriter&) const src/./primitives/transaction.h:326:9
    #6 0x563f09569863 in void Serialize<CHashWriter, CTransaction>(CHashWriter&, CTransaction const&) src/./serialize.h:776:7
    #7 0x563f09569863 in CHashWriter& CHashWriter::operator<<<CTransaction>(CTransaction const&) src/./hash.h:161:9
    #8 0x563f09569863 in CTransaction::ComputeWitnessHash() const src/primitives/transaction.cpp:93:47
    #9 0x563f09569f47 in CTransaction::CTransaction(CMutableTransaction&&) src/primitives/transaction.cpp:97:190
    #10 0x563f06eb3f68 in std::__1::__shared_ptr_emplace<CTransaction const, std::__1::allocator<CTransaction const>>::__shared_ptr_emplace[abi:v170002]<CMutableTransaction>(std::__1::allocator<CTransaction const>, CMutableTransaction&&) /msan/cxx_build/include/c++/v1/__memory/shared_ptr.h:302:37
    #11 0x563f06eb3f68 in std::__1::shared_ptr<CTransaction const> std::__1::allocate_shared[abi:v170002]<CTransaction const, std::__1::allocator<CTransaction const>, CMutableTransaction, void>(std::__1::allocator<CTransaction const> const&, CMutableTransaction&&) /msan/cxx_build/include/c++/v1/__memory/shared_ptr.h:1022:55
    #12 0x563f09071b26 in std::__1::shared_ptr<CTransaction const> std::__1::make_shared[abi:v170002]<CTransaction const, CMutableTransaction, void>(CMutableTransaction&&) /msan/cxx_build/include/c++/v1/__memory/shared_ptr.h:1031:12
    #13 0x563f09071b26 in std::__1::shared_ptr<CTransaction const> MakeTransactionRef<CMutableTransaction>(CMutableTransaction&&) src/./primitives/transaction.h:418:93
    #14 0x563f09071b26 in ChainstateManager::UpdateUncommittedBlockStructures(CBlock&, CBlockIndex const*) const src/validation.cpp:3706:24
    #15 0x563f09073e46 in ChainstateManager::GenerateCoinbaseCommitment(CBlock&, CBlockIndex const*) const src/validation.cpp:3733:5
    #16 0x563f08bb9dd6 in node::BlockAssembler::CreateNewBlock(CScript const&) src/node/miner.cpp:160:69
    #17 0x563f0822b2a5 in TestChain100Setup::CreateBlock(std::__1::vector<CMutableTransaction, std::__1::allocator<CMutableTransaction>> const&, CScript const&, Chainstate&) src/test/util/setup_common.cpp:310:56

glozow

Concept ACK

src/primitives/transaction.h

ajtowns · 2023-11-01T13:57:32Z

 #3 0x563f0956bff5 in CTransaction::HasWitness() const src/./primitives/transaction.h:370:33
 #4 0x563f0956bff5 in void SerializeTransaction<CHashWriter, CTransaction>(CTransaction const&, CHashWriter&) src/./primitives/transaction.h:265:16

If it's not obvious, the bug is that m_witness_hash is initialised by serializing the transaction with its witness, but in that case, Serialize calls HasWitness() to work out which format to use, which with this change relies on access to m_witness_hash which is not yet initialised.

dergoegge · 2023-11-01T14:17:45Z

Added a std::optional<bool> to CTransaction as a cache for HasWitness.

stickies-v · 2023-11-01T14:40:09Z

I'm not a fan of the std::optional<bool> cache approach, it's an error-prone API imo (even if it's private and small). I think it makes more sense to just calculate the boolean at initialization and follow the same pattern as for hash and m_witness_hash? It has to be calculated for every instance anyway.

~~As e.g. in stickies-v@2e69a65~~

Edit: using default initializers makes everything a bit more concise: stickies-v@eb316f9 (diff)

dergoegge · 2023-11-01T15:11:45Z

Thanks @stickies-v went with your suggestion as my previous idea was buggy again🔥

ajtowns · 2023-11-01T15:25:10Z

src/primitives/transaction.h

@@ -310,12 +310,15 @@ class CTransaction

 private:
    /** Memory only. */
+    const bool m_has_witness;


Not really a fan of adding 8 bytes to every CTransaction, though I guess it's not horrible?

Another approach would be to separate out the raw data and compute-logic into one class, and the optimised version with the w/txid cached into a separate class. That looks like: ajtowns@89e3c80

Personally, I don't think the extra bytes matter that much, so I would lean towards leaving this as is.

Really, we're already wasting 16 bytes due to vin and vout supporting capacity() != size() despite both being const and never needing to be resized.

theStack · 2023-11-06T15:33:54Z

Concept ACK

theStack

The code looks correct.

I'm still unsure on how to assess whether the extra memory needed is okay or not, other than "gut feeling". I would have expected that the increased size of CTransaction (on my system, it's 128 bytes on the PR vs. 120 bytes on master, a growth of ~6.66%) has a tiny impact on the mempool usage accounting as well, but it seems that it doesn't, as RecursiveDynamicUsage only cares about a tx's inputs and outputs fields:

bitcoin/src/core_memusage.h

Lines 32 to 41 in 8243762

    
           static inline size_t RecursiveDynamicUsage(const CTransaction& tx) { 
        
               size_t mem = memusage::DynamicUsage(tx.vin) + memusage::DynamicUsage(tx.vout); 
        
               for (std::vector<CTxIn>::const_iterator it = tx.vin.begin(); it != tx.vin.end(); it++) { 
        
                   mem += RecursiveDynamicUsage(*it); 
        
               } 
        
               for (std::vector<CTxOut>::const_iterator it = tx.vout.begin(); it != tx.vout.end(); it++) { 
        
                   mem += RecursiveDynamicUsage(*it); 
        
               } 
        
               return mem; 
        
           }

(A bit off-topic to this PR, but is it really intended to only do dynamic accounting? Just because something doesn't grow dynamically doesn't mean it's completely free, the static parts still have to reside somewhere in memory.)

sipa · 2023-11-12T23:04:02Z

@theStack The dynamic memory isn't the only thing that matters, but it's the only thing you don't already account for at another level.

If you have a single transaction, and want its memory usage, use sizeof(transaction) + DynamicMemoryUsage(transaction). It'd be convenient to have the transaction sizeof added to that memory usage. However, things would go wrong if you use a collection then.

Say you have a vector of transactions, and want its memory usage. You'd use sizeof(vector) + dynamic memory of the vector (which consists of the memory allocated by the vector, plus the memory allocated by all of the transactions in it). However, the transactions themselves are in the vector's allocated memory. If they'd be counted alongside their own allocated memory, they'd be counted twice here (and incorrectly, because the overhead of memory allocation only applies once to the entire array, not individually).

theStack · 2023-11-12T23:47:03Z

@sipa: Thanks for clarifying, that makes sense and I see now why it would be a bad idea to add the static part directly in the DynamicMemoryUsage functions. For the specific case of mempool usage accounting, one could add sizeof(CTransaction) at the call-site though? E.g.

diff --git a/src/kernel/mempool_entry.h b/src/kernel/mempool_entry.h
index 1f175a5ccf..95362fe2d9 100644
--- a/src/kernel/mempool_entry.h
+++ b/src/kernel/mempool_entry.h
@@ -108,7 +108,7 @@ public:
         : tx{tx},
           nFee{fee},
           nTxWeight{GetTransactionWeight(*tx)},
-          nUsageSize{RecursiveDynamicUsage(tx)},
+          nUsageSize{sizeof(*tx) + RecursiveDynamicUsage(tx)},
           nTime{time},
           entry_sequence{entry_sequence},
           entryHeight{entry_height},

(Not saying it's really worth it to fiddle around in this area, and I personally wouldn't feel comfortable to open a PR. I was just very surprised that one could -- in theory -- bloat up CTransaction and it still goes unnoticed for the mempool usage accounting.)

sipa · 2023-11-13T01:30:06Z

@theStack (deleted my previous comment, I had missed a lot).

RecursiveDynamicUsage for CTransactionRef (which is a std::shared_ptr<CTransaction>) should already account for both the memory allocated in the shared pointer as well as the memory allocated by the transaction. See src/core_memusage.h:

template<typename X>
static inline size_t RecursiveDynamicUsage(const std::shared_ptr<X>& p) {
    return p ? memusage::DynamicUsage(p) + RecursiveDynamicUsage(*p) : 0;
}

where memusage::DynamicUsage(p) invokes this function from src/memusage.h:

template<typename X>
static inline size_t DynamicUsage(const std::shared_ptr<X>& p)
{
    // A shared_ptr can either use a single continuous memory block for both
    // the counter and the storage (when using std::make_shared), or separate.
    // We can't observe the difference, however, so assume the worst.
    return p ? MallocUsage(sizeof(X)) + MallocUsage(sizeof(stl_shared_counter)) : 0;
}

Is it possible that you just don't see the added field cause changes because either it's fitting in earlier padding space, or because the rounding up effect of malloc() hides the difference?

theStack · 2023-11-13T03:02:23Z

@sipa: Ah, very interesting, have to admit I completely overlooked RecursiveDynamicUsage(const std::shared_ptr<X>& p).

Is it possible that you just don't see the added field cause changes because either it's fitting in earlier padding space, or because the rounding up effect of malloc() hides the difference?

Yes, it was the latter. The added field did lead to a size increase of CTransaction (from 120 to 128 bytes), but MallocUsage indeed yields the same value for both of these sizes on a 64-bit machine (namely ((120+31)>>4)<<4 = ((128+31)>>4)<<4 = 144). So that's the reason why I did see the same mempool usage accounting results both on master and the PR on my machine (and wrongly concluded from that that the size of CTransaction is not accounted for). To be sure, I intentionally bloated CTransaction up much more with an const uint8_t dummy[100] = {}; entry and verified that the mempool usage was higher indeed. So, the accounting seems to work as expected and my worries were unjustified.

If anyone wants to run these experiments: what I did was to start bitcoind with an existing mempool.dat in datadir both on master and the PR branch with -nolisten -noconnect (to avoid clogging the mempool with new txs), wait until the mempool loaded, ran bitcoin-cli getmempoolinfo and compared the "usage" result field value (other fields like "size", "bytes" and "total_fee" should always be equal for the same mempool.dat).

ACK af1d2ff

(still based on a gut feeling, but garnished with some napkin math: the mempool on my node is close to it's maximum of 300MB and has about ~100k entries now; MallocUsage increases in 16-bytes steps on 64-bit machines, so that would in the worst case lead to ~1,6MB more usage, where now "only" ~99.5k entries would fit; I think that's fine).

stickies-v

ACK af1d2ff

I think the PR and commit description would benefit from describing that this is not a strict improvement, as we slightly increase memory usage (which I think is an acceptable trade-off).

stickies-v · 2023-11-13T11:28:39Z

src/primitives/transaction.h

@@ -310,12 +310,15 @@ class CTransaction

 private:
    /** Memory only. */
+    const bool m_has_witness;


nit: Using default initializer makes more sense here I think:

git diff on af1d2ff

diff --git a/src/primitives/transaction.cpp b/src/primitives/transaction.cpp index 77a363f7b6..a20cf25f53 100644 --- a/src/primitives/transaction.cpp +++ b/src/primitives/transaction.cpp @@ -93,8 +93,10 @@ Wtxid CTransaction::ComputeWitnessHash() const return Wtxid::FromUint256((CHashWriter{0} << *this).GetHash()); } -CTransaction::CTransaction(const CMutableTransaction& tx) : vin(tx.vin), vout(tx.vout), nVersion(tx.nVersion), nLockTime(tx.nLockTime), m_has_witness{ComputeHasWitness()}, hash{ComputeHash()}, m_witness_hash{ComputeWitnessHash()} {} -CTransaction::CTransaction(CMutableTransaction&& tx) : vin(std::move(tx.vin)), vout(std::move(tx.vout)), nVersion(tx.nVersion), nLockTime(tx.nLockTime), m_has_witness{ComputeHasWitness()}, hash{ComputeHash()}, m_witness_hash{ComputeWitnessHash()} {} +CTransaction::CTransaction(const CMutableTransaction& tx) + : vin(tx.vin), vout(tx.vout), nVersion(tx.nVersion), nLockTime(tx.nLockTime) {} +CTransaction::CTransaction(CMutableTransaction&& tx) + : vin(std::move(tx.vin)), vout(std::move(tx.vout)), nVersion(tx.nVersion), nLockTime(tx.nLockTime) {} CAmount CTransaction::GetValueOut() const { diff --git a/src/primitives/transaction.h b/src/primitives/transaction.h index 594168bbcc..facbd8c769 100644 --- a/src/primitives/transaction.h +++ b/src/primitives/transaction.h @@ -310,9 +310,9 @@ public: private: /** Memory only. */ - const bool m_has_witness; - const Txid hash; - const Wtxid m_witness_hash; + const bool m_has_witness{ComputeHasWitness()}; + const Txid hash{ComputeHash()}; + const Wtxid m_witness_hash{ComputeWitnessHash()}; Txid ComputeHash() const; Wtxid ComputeWitnessHash() const;

dergoegge · 2023-11-21T10:43:13Z

I think this is rfm. Could a maintainer take a look?

TheCharlatan

ACK af1d2ff

Nit: Maybe add a HasWitness check somewhere in the transaction_tests?

mzumsande · 2023-11-22T20:40:32Z

Since it comes with a memory cost, I wonder if the the performance gain we get for it can be measured. Does this speed up any benchmarks / IBD?

dergoegge · 2023-11-23T10:00:49Z

Since it comes with a memory cost, I wonder if the the performance gain we get for it can be measured. Does this speed up any benchmarks / IBD?

It's less of a gain and more just trying to revert the performance regression I introduced in #28017. I guess you are asking if it is a noticeable regression. I would think both the performance regression and the new memory cost from here are negligible and I can certainly also live with what is in master now.

mzumsande

I guess you are asking if it is a noticeable regression. I would think both the performance regression and the new memory cost from here are negligible and I can certainly also live with what is in master now.

I think this should be easily benched just with a --reindex-chainstate? I could try that next week if no one else wants to. I just think that if it's possible within reasonable efforts, this kind of decisions should be made based on empirical data rather than guessing what the performance / memory impacts may be.

dergoegge · 2023-11-24T17:45:04Z

this kind of decisions should be made based on empirical data rather than guessing what the performance / memory impacts may be.

This makes it sound like this is all up in the air when the trade-off for this PR is well known. All this does is revert to the performance we had pre-#28107 while adding an extra 8 bytes to every CTransaction instance.

The runtime complexity of HasWitness is now O(tx.vin.size()) while it used to be O(1). That could be considered good enough reasoning for this PR even if a IBD/reindex benchmark doesn't show a noticeable difference. The 4 concept acks and 3 full acks seem to agree on that.

In any case the memory usage of CTransaction could be optimized in a follow-up, even beyond reverting the overhead from this PR (#28766 (comment)).

Feel free to do the benchmark, but I'm not sure if the result matters much at this point, since we have already spend sufficient review time on this PR.

dergoegge · 2023-11-24T20:49:31Z

All this does is revert to the performance we had pre-#28107 while adding an extra > 8 bytes to every CTransaction instance.

The runtime complexity of HasWitness is now O(tx.vin.size()) while it used to be O(1).

Whoops, this isn't really true. HasWitness was unchanged by #28107, the only change in this regard was cdb14d7.

This PR improves the performance for all call sites of HasWitness not just the ones in cdb14d7.

achow101 · 2023-11-28T13:44:36Z

ACK af1d2ff

fanquake requested review from ajtowns and glozow November 1, 2023 11:09

stickies-v reviewed Nov 1, 2023

View reviewed changes

src/primitives/transaction.cpp Outdated Show resolved Hide resolved

src/primitives/transaction.cpp Outdated Show resolved Hide resolved

glozow reviewed Nov 1, 2023

View reviewed changes

maflcko reviewed Nov 1, 2023

View reviewed changes

src/primitives/transaction.h Outdated Show resolved Hide resolved

DrahtBot added the CI failed label Nov 1, 2023

dergoegge force-pushed the 2023-10-28107-followup branch from fed7b91 to 6643128 Compare November 1, 2023 14:15

dergoegge force-pushed the 2023-10-28107-followup branch from 6643128 to 6b2bd8d Compare November 1, 2023 15:09

[primitives] Precompute result of CTransaction::HasWitness

af1d2ff

dergoegge force-pushed the 2023-10-28107-followup branch from 6b2bd8d to af1d2ff Compare November 1, 2023 15:15

ajtowns reviewed Nov 1, 2023

View reviewed changes

DrahtBot removed the CI failed label Nov 1, 2023

glozow added the Refactoring label Nov 2, 2023

theStack reviewed Nov 12, 2023

View reviewed changes

DrahtBot requested a review from stickies-v November 13, 2023 03:02

DrahtBot requested a review from glozow November 13, 2023 03:02

stickies-v approved these changes Nov 13, 2023

View reviewed changes

DrahtBot requested review from stickies-v and removed request for stickies-v November 13, 2023 11:49

TheCharlatan approved these changes Nov 22, 2023

View reviewed changes

mzumsande reviewed Nov 24, 2023

View reviewed changes

achow101 merged commit 26b7bcf into bitcoin:master Nov 28, 2023

bitcoin locked and limited conversation to collaborators Nov 27, 2024

	static inline size_t RecursiveDynamicUsage(const CTransaction& tx) {
	size_t mem = memusage::DynamicUsage(tx.vin) + memusage::DynamicUsage(tx.vout);
	for (std::vector<CTxIn>::const_iterator it = tx.vin.begin(); it != tx.vin.end(); it++) {
	mem += RecursiveDynamicUsage(*it);
	}
	for (std::vector<CTxOut>::const_iterator it = tx.vout.begin(); it != tx.vout.end(); it++) {
	mem += RecursiveDynamicUsage(*it);
	}
	return mem;
	}

Improve peformance of CTransaction::HasWitness (28107 follow-up) #28766

Improve peformance of CTransaction::HasWitness (28107 follow-up) #28766

Uh oh!

Conversation

dergoegge commented Nov 1, 2023

Uh oh!

DrahtBot commented Nov 1, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Coverage

Reviews

Uh oh!

dergoegge commented Nov 1, 2023

Uh oh!

fanquake commented Nov 1, 2023

Uh oh!

dergoegge commented Nov 1, 2023

Uh oh!

fanquake commented Nov 1, 2023

Uh oh!

stickies-v left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

fanquake commented Nov 1, 2023

Uh oh!

glozow left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ajtowns commented Nov 1, 2023

Uh oh!

dergoegge commented Nov 1, 2023

Uh oh!

stickies-v commented Nov 1, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dergoegge commented Nov 1, 2023

Uh oh!

ajtowns Nov 1, 2023

Choose a reason for hiding this comment

Uh oh!

dergoegge Nov 3, 2023

Choose a reason for hiding this comment

Uh oh!

ajtowns Nov 15, 2023

Choose a reason for hiding this comment

Uh oh!

theStack commented Nov 6, 2023

Uh oh!

theStack left a comment

Choose a reason for hiding this comment

Uh oh!

sipa commented Nov 12, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

theStack commented Nov 12, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sipa commented Nov 13, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

theStack commented Nov 13, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

stickies-v left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

stickies-v Nov 13, 2023

Choose a reason for hiding this comment

Uh oh!

dergoegge commented Nov 21, 2023

Uh oh!

TheCharlatan left a comment

Choose a reason for hiding this comment

Uh oh!

mzumsande commented Nov 22, 2023

Uh oh!

dergoegge commented Nov 23, 2023

DrahtBot commented Nov 1, 2023 •

edited

Loading

stickies-v commented Nov 1, 2023 •

edited

Loading

sipa commented Nov 12, 2023 •

edited

Loading

theStack commented Nov 12, 2023 •

edited

Loading

sipa commented Nov 13, 2023 •

edited

Loading

theStack commented Nov 13, 2023 •

edited

Loading

stickies-v left a comment •

edited

Loading

dergoegge commented Nov 24, 2023 •

edited

Loading

dergoegge commented Nov 24, 2023 •

edited

Loading