
Conversation

JeremyRubin
Contributor

@JeremyRubin JeremyRubin commented Jul 9, 2020

Currently we have a peculiar data structure in the mempool called mapLinks. Its job is to track the in-pool children and parents of each transaction. This PR can be understood and reviewed primarily as a simple refactoring that removes this extra data structure, although it comes with a nice memory and performance improvement for free.

mapLinks is particularly peculiar because removing it is not as simple as just moving its inner structure to the owning CTxMemPoolEntry. Because TxLinks (the type storing the setEntries for parents and children) stores txiters to each mempool entry corresponding to a parent or child, the TxLinks type is "aware" of the boost multi_index (mapTx) it comes from, which is in turn aware of the entry type stored in mapTx. Thus we used mapLinks to keep this per-entry data in an entirely separate data structure, just to avoid the circular type reference that storing a txiter inside a CTxMemPoolEntry would cause.

It turns out we can break this circular reference by making use of the multi_index iterator_to member function and std::reference_wrapper. This lets us get rid of the mapLinks data structure and move ownership of the parent/child sets into the entries themselves.
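To illustrate the shape of the fix, here is a minimal self-contained sketch of the pattern, with made-up names (Entry, EntryRef, CompareById, Pool); the real change uses CTxMemPoolEntryRef and a comparator like the one visible in the diff further down this thread:

#include <boost/multi_index_container.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/member.hpp>
#include <functional>
#include <set>

struct Entry;
using EntryRef = std::reference_wrapper<const Entry>;

// Comparator declared up front; defined once Entry is complete.
struct CompareById {
    bool operator()(const EntryRef& a, const EntryRef& b) const;
};

// The entry owns its parent/child sets directly; it refers to other entries
// only via reference_wrapper, so it never needs to name a container iterator.
struct Entry {
    int id;
    mutable std::set<EntryRef, CompareById> m_parents;
    mutable std::set<EntryRef, CompareById> m_children;
};

bool CompareById::operator()(const EntryRef& a, const EntryRef& b) const
{
    return a.get().id < b.get().id;
}

using namespace boost::multi_index;
using Pool = boost::multi_index_container<
    Entry,
    indexed_by<ordered_unique<member<Entry, int, &Entry::id>>>>;

// iterator_to recovers a Pool iterator from a bare reference to an element,
// which is what lets the entry-side sets replace the old txiter-based TxLinks.
Pool::iterator ToIter(Pool& pool, const Entry& e)
{
    return pool.iterator_to(e);
}

int main()
{
    Pool pool;
    const Entry& parent = *pool.insert(Entry{1, {}, {}}).first;
    const Entry& child = *pool.insert(Entry{2, {}, {}}).first;
    child.m_parents.insert(std::cref(parent));   // mutable sets allow this through const refs
    parent.m_children.insert(std::cref(child));
    return ToIter(pool, parent)->id;             // 1
}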

The benefit is good all around: any one of the reasons given below would make the change acceptable on its own, and it doesn't make the code harder to reason about or worse in any respect (as far as I can tell, there's no tradeoff).

Simpler ownership model

We no longer have to consistency-check that mapLinks actually has records for our CTxMemPoolEntry, and it becomes impossible for a mapLinks entry to outlive, or incorrectly die before, its CTxMemPoolEntry.

Memory Usage

We get rid of an O(transactions)-sized map in the mempool, which is a long-lived data structure.

Performance

If you have a CTxMemPoolEntry, you immediately know the address of its children/parents, rather than having to do an O(log(transactions)) lookup via mapLinks (which we do very often). We do it in so many places that a true benchmark has to look at a full running node, but it is easy enough to show an improvement in this case.
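As a toy illustration of the access-pattern difference (not mempool code; a side table of relations vs. relations stored inline on the entry):

#include <cassert>
#include <map>
#include <set>

struct EntryOld { int id; };                          // relations live in a separate map (like mapLinks)
struct EntryNew { int id; std::set<int> children; };  // relations live inside the entry itself

int main()
{
    std::map<int, std::set<int>> links;               // side table, keyed by entry
    EntryOld a{1};
    links[a.id] = {2, 3};

    // Old shape: every access pays an O(log n) lookup into the side table.
    const std::set<int>& old_children = links.at(a.id);

    // New shape: given the entry, its children are a direct member access away.
    EntryNew b{1, {2, 3}};
    const std::set<int>& new_children = b.children;

    assert(old_children == new_children);
    return 0;
}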

The ComplexMemPool benchmark serves as a good coherence check and shows the expected result: about 12.5% less time per run, i.e. roughly 1.14x faster.

Before:
# Benchmark, evals, iterations, total, min, max, median
ComplexMemPool, 5, 1, 1.40462, 0.277222, 0.285339, 0.279793

After:
# Benchmark, evals, iterations, total, min, max, median
ComplexMemPool, 5, 1, 1.22586, 0.243831, 0.247076, 0.244596

The ComplexMemPool benchmark only exercises addUnchecked and TrimToSize for 800 transactions. While this bench does a good job of hammering the relevant kinds of functions, it doesn't test everything.

Subbing in 5000 transactions shows that the advantage isn't wiped out by other asymptotic factors (this isn't the only bottleneck in growing the mempool); the improvement is only a bit smaller proportionally (10.8% less time, 1.12x), which adds evidence that this will be a good change for performance-minded users.

# Benchmark, evals, iterations, total, min, max, median
ComplexMemPool, 5, 1, 59.1321, 11.5919, 12.235, 11.7068

# Benchmark, evals, iterations, total, min, max, median
ComplexMemPool, 5, 1, 52.1307, 10.2641, 10.5206, 10.4306

I don't think it's possible to come up with an example where a mapLinks-based design would have better performance, but it's something for reviewers to consider.

Discussion

Why mapLinks in the first place?

I spoke with the author of mapLinks (sdaftuar) a while back, and my recollection from our conversation was that it was implemented that way because he did not know how to resolve the circular dependency at the time; there was no other reason for making it a separate map.

Is iterator_to weird?

iterator_to is expressly for this purpose, see https://www.boost.org/doc/libs/1_51_0/libs/multi_index/doc/tutorial/indices.html#iterator_to

iterator_to provides a way to retrieve an iterator to an element from a pointer to the element, thus making iterators and pointers interchangeable for the purposes of element pointing (not so for traversal) in many situations. This notwithstanding, it is not the aim of iterator_to to promote the usage of pointers as substitutes for real iterators: the latter are specifically designed for handling the elements of a container, and not only benefit from the iterator orientation of container interfaces, but are also capable of exposing many more programming bugs than raw pointers, both at compile and run time. iterator_to is thus meant to be used in scenarios where access via iterators is not suitable or desireable:

- Interoperability with preexisting APIs based on pointers or references.
- Publication of pointer-based interfaces (for instance, when designing a C-compatible library).
- The exposure of pointers in place of iterators can act as a type erasure barrier effectively decoupling the user of the code from the implementation detail of which particular container is being used. Similar techniques, like the famous Pimpl idiom, are used in large projects to reduce dependencies and build times.
- Self-referencing contexts where an element acts upon its owner container and no iterator to itself is available.

In other words, iterator_to is the perfect tool for the job, per the last reason given. Under the hood it should just be a simple pointer cast with no major runtime overhead (depending on whether the function call is inlined).
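For reference, a minimal standalone example of this usage (plain Boost.MultiIndex, nothing mempool-specific):

#include <boost/multi_index_container.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/identity.hpp>
#include <cassert>

using boost::multi_index_container;
using namespace boost::multi_index;

int main()
{
    multi_index_container<int, indexed_by<ordered_unique<identity<int>>>> c;
    auto inserted = c.insert(42).first;

    const int& elem = *inserted;          // keep only a plain reference to the element
    auto recovered = c.iterator_to(elem); // recover an iterator from that reference

    assert(recovered == inserted);
    return 0;
}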

Edit by laanwj: removed at sign from the description

@DrahtBot
Contributor

DrahtBot commented Jul 10, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@laanwj
Member

laanwj commented Jul 15, 2020

This is cool! How much memory would be saved by this change on average, do you think?

@JeremyRubin
Contributor Author

JeremyRubin commented Jul 15, 2020

std::sets are awful for memory. I think you can get a max of about 220k txns in the mempool... let's assume the average is 50k? I don't know if that's a good assumption though. The overhead for a map entry is very high; I would guess it's somewhere around 64 bytes per entry. I can double check, but based on:

 struct stl_tree_node
 {
 private:
     int color;
     void* parent;
     void* left;
     void* right;
     std::pair<txiter, TxLinks> value; // the map's key/value payload
 };

That's 4 bytes (color) + 4 bytes alignment + 3 * 8 bytes of pointers + 8 bytes for the txiter (ignoring TxLinks itself since we still store one in the CTxMemPoolEntry, but adding another 8 bytes for alignment and padding) = 6 * 8 = 48 bytes of raw overhead. Then with MallocUsage accounting, this gives us 64 bytes. So it may not actually use 64 bytes per element, but we certainly count that many! Depending on how big TxLinks itself is, we may have some extra overhead (implementation dependent).

Anyways, let's assume 64 bytes.

64*50,000 bytes = 3.2 MB, 64 * 200,000 bytes = 12.8 MB.

For a 300MB pool, that's a 1-3% memory saving. Note that this doesn't account for the other downsides of the memory fragmentation caused by doing so many allocations, but a pooled allocator could also help with that issue.

(A further change which is maybe worth looking at is the children sets themselves -- because there is so much memory overhead, storing std::vectors instead of std::sets might be more efficient for the many txs with small numbers of vins/vouts, especially since checking whether a hash H is in a small vector can be heavily optimized vs. log N random-memory compares.)
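A toy sketch of that follow-up idea (illustrative only; plain ints stand in for the real entry references):

#include <algorithm>
#include <cassert>
#include <vector>

// For the typically small parent/child sets, a flat std::vector avoids the per-node
// allocations and pointer chasing of std::set; membership is a cache-friendly linear scan.
bool Contains(const std::vector<int>& v, int h)
{
    return std::find(v.begin(), v.end(), h) != v.end();
}

int main()
{
    std::vector<int> children{7, 3, 11};  // small enough that O(n) scan beats O(log n) tree walks in practice
    assert(Contains(children, 3));
    assert(!Contains(children, 5));
    return 0;
}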

@JeremyRubin
Contributor Author

template <size_t s1, size_t s2> struct static_check {
    static_check() {
        static_assert(s1 == s2);
    }
};

auto s = static_check<48, sizeof(setEntries)>();
auto s2 = static_check<48 * 2, sizeof(TxLinks)>();
auto s3 = static_check<136, sizeof(memusage::stl_tree_node<std::pair<const txiter, TxLinks>>)>();
auto s4 = static_check<160, ((31 + sizeof(memusage::stl_tree_node<std::pair<const txiter, TxLinks>>)) >> 4) << 4>();

This confirms that a setEntries uses 48 bytes for the container itself, and that each TxLinks node in the mapLinks data structure carries 64 bytes of overhead compared to a TxLinks inlined in CTxMemPoolEntry. Even though the nodes themselves are only 136 bytes, the MallocUsage accounting pads them out to 160 bytes.
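To tie those numbers back to the earlier savings estimate, a small back-of-the-envelope check: the rounding mirrors the ((31 + size) >> 4) << 4 expression above (the 64-bit MallocUsage accounting), and treating 160 accounted bytes minus the 96 bytes of now-inlined TxLinks as the per-entry delta is my reading of where the 64-byte figure comes from:

#include <cstddef>
#include <cstdio>

// 64-bit accounting rule, mirroring the ((31 + size) >> 4) << 4 expression above.
constexpr size_t Accounted(size_t alloc) { return ((alloc + 31) >> 4) << 4; }

int main()
{
    constexpr size_t tx_links = 2 * 48;  // TxLinks: two 48-byte setEntries, now inlined in the entry
    constexpr size_t map_node = 136;     // stl_tree_node<std::pair<const txiter, TxLinks>>
    constexpr size_t saved = Accounted(map_node) - tx_links;  // 160 - 96 = 64 bytes per entry

    static_assert(Accounted(map_node) == 160, "node accounted at 160 bytes");
    static_assert(saved == 64, "64 bytes saved per entry");

    // Matches the earlier estimate: ~3.2 MB at 50k txs, ~12.8 MB at 200k txs.
    std::printf("%zu bytes/entry -> %.1f MB at 50k, %.1f MB at 200k\n",
                saved, saved * 50000 / 1e6, saved * 200000 / 1e6);
    return 0;
}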

Member

@jonatack jonatack left a comment

These changes look good. I'm unsure what our plans are with respect to Boost features.

ACK dc0d3fb code review and tested rebased on master. The benchmarks show a 20% improvement for me, built with --enable-bench CXXFLAGS="-O2".

Current master 476436b

# Benchmark, evals, iterations, total, min, max, median
ComplexMemPool, 50, 1, 22.3, 0.371758, 0.665461, 0.426088

After, rebased on master

# Benchmark, evals, iterations, total, min, max, median
ComplexMemPool, 50, 1, 18.2, 0.337176, 0.493058, 0.354405

With 5000 transactions

src/bench/mempool_stress.cpp
-    for (auto x = 0; x < 800 && !available_coins.empty(); ++x) {
+    for (auto x = 0; x < 5000 && !available_coins.empty(); ++x) {

Current master 476436b

# Benchmark, evals, iterations, total, min, max, median
ComplexMemPool, 5, 1, 97.3971, 18.5328, 20.4063, 19.4885

After, rebased on master

# Benchmark, evals, iterations, total, min, max, median
ComplexMemPool, 5, 1, 80.3426, 15.6254, 16.5093, 16.118

@jonatack
Member

nit: maybe a less violent PR title 🪓 🙂 👍 (joking, but s/kill/remove/ works) and add the component prefix

@sipa
Member

sipa commented Jul 20, 2020

@jonatack You must have hated #9260.

@jonatack
Member

Not bad :) The description even required translation. #9260 (comment)

@jonatack
Member

One run each of all benchmarks (though I suspect there is a fair amount of variation between runs).

all benchmarks, master vs this branch rebased on master

current master

(master)$ ./src/bench/bench_bitcoin
# Benchmark, evals, iterations, total, min, max, median
AddrManAdd, 5, 5, 1.72099, 0.063045, 0.0772824, 0.0669075
AddrManGetAddr, 5, 500, 5.81175, 0.0022402, 0.00251569, 0.0022842
AddrManGood, 5, 2, 6.0441, 0.584628, 0.624264, 0.596058
AddrManSelect, 5, 1000000, 2.69665, 5.28107e-07, 5.52293e-07, 5.3419e-07
AssembleBlock, 5, 700, 1.62831, 0.000441994, 0.000487594, 0.000469525
Base58CheckEncode, 5, 320000, 4.00335, 2.45396e-06, 2.58267e-06, 2.48388e-06
Base58Decode, 5, 800000, 4.13891, 9.41483e-07, 1.20126e-06, 9.73678e-07
Base58Encode, 5, 470000, 3.47461, 1.45507e-06, 1.49106e-06, 1.48153e-06
Bech32Decode, 5, 800000, 2.02791, 4.81211e-07, 5.65706e-07, 4.93396e-07
Bech32Encode, 5, 800000, 2.97694, 6.88539e-07, 8.73764e-07, 7.08336e-07
BenchLockedPool, 5, 1300, 6.36001, 0.000945641, 0.00101048, 0.000980838
BenchTimeDeprecated, 5, 100000000, 3.06631, 5.61812e-09, 7.04958e-09, 6.02113e-09
BenchTimeMillis, 5, 6000000, 2.88408, 9.29844e-08, 1.00058e-07, 9.63667e-08
BenchTimeMillisSys, 5, 6000000, 2.92817, 9.08119e-08, 1.17317e-07, 9.40587e-08
BenchTimeMock, 5, 300000000, 3.31172, 2.1875e-09, 2.21985e-09, 2.21037e-09
BlockToJsonVerbose, 5, 10, 4.90092, 0.0940476, 0.100517, 0.0993346
BnBExhaustion, 5, 650, 3.83522, 0.00111645, 0.00123052, 0.00119955
CCheckQueueSpeedPrevectorJob, 5, 1400, 7.23464, 0.000990455, 0.00107279, 0.00103406
CCoinsCaching, 5, 170000, 0.356379, 3.90721e-07, 4.65786e-07, 4.13522e-07
CHACHA20_1MB, 5, 340, 3.97313, 0.00228695, 0.00236048, 0.00235003
CHACHA20_256BYTES, 5, 250000, 0.744309, 5.75571e-07, 6.18917e-07, 5.97898e-07
CHACHA20_64BYTES, 5, 500000, 0.39087, 1.54202e-07, 1.60676e-07, 1.55381e-07
CHACHA20_POLY1305_AEAD_1MB_ENCRYPT_DECRYPT, 5, 340, 11.3753, 0.0064394, 0.00718509, 0.00649879
CHACHA20_POLY1305_AEAD_1MB_ONLY_ENCRYPT, 5, 340, 5.56076, 0.00322281, 0.00332228, 0.00327013
CHACHA20_POLY1305_AEAD_256BYTES_ENCRYPT_DECRYPT, 5, 250000, 2.77848, 2.16111e-06, 2.28038e-06, 2.22017e-06
CHACHA20_POLY1305_AEAD_256BYTES_ONLY_ENCRYPT, 5, 250000, 1.40301, 1.09434e-06, 1.14021e-06, 1.13141e-06
CHACHA20_POLY1305_AEAD_64BYTES_ENCRYPT_DECRYPT, 5, 500000, 2.62965, 1.01675e-06, 1.09266e-06, 1.05452e-06
CHACHA20_POLY1305_AEAD_64BYTES_ONLY_ENCRYPT, 5, 500000, 1.3327, 5.12329e-07, 5.47454e-07, 5.30224e-07
CoinSelection, 5, 650, 0.755947, 0.000220572, 0.000257613, 0.000226304
ComplexMemPool, 5, 1, 1.96453, 0.377724, 0.41796, 0.389692
ConstructGCSFilter, 5, 1000, 10.173, 0.00198849, 0.00209198, 0.00204057
DeserializeAndCheckBlockTest, 5, 160, 6.35808, 0.0076843, 0.00885282, 0.007737
DeserializeBlockTest, 5, 130, 4.11579, 0.00605016, 0.00697307, 0.00626367
DuplicateInputs, 5, 10, 0.643048, 0.0107885, 0.014601, 0.0130247
FastRandom_1bit, 5, 440000000, 5.20149, 2.31664e-09, 2.42676e-09, 2.33602e-09
FastRandom_32bit, 5, 110000000, 6.10483, 1.08269e-08, 1.13123e-08, 1.11124e-08
HASH_1MB, 5, 340, 6.63799, 0.00383218, 0.00397389, 0.00389316
HASH_256BYTES, 5, 250000, 1.97735, 1.49095e-06, 1.78286e-06, 1.54661e-06
HASH_64BYTES, 5, 500000, 2.0604, 7.66024e-07, 9.86615e-07, 7.89013e-07
MatchGCSFilter, 5, 50000, 7.91817, 2.90019e-05, 3.72023e-05, 3.08161e-05
MempoolEviction, 5, 41000, 9.86147, 4.14122e-05, 5.192e-05, 4.86635e-05
MerkleRoot, 5, 800, 6.25447, 0.00153309, 0.00158666, 0.00156357
POLY1305_1MB, 5, 340, 1.58711, 0.000894605, 0.00101961, 0.000919736
POLY1305_256BYTES, 5, 250000, 0.299447, 2.30486e-07, 2.54892e-07, 2.37903e-07
POLY1305_64BYTES, 5, 500000, 0.189159, 7.41961e-08, 7.89804e-08, 7.44001e-08
PrePadded, 5, 10000, 0.0126671, 2.47568e-07, 2.73122e-07, 2.48886e-07
PrevectorClearNontrivial, 5, 28300, 9.39727, 6.36911e-05, 6.8178e-05, 6.71331e-05
PrevectorClearTrivial, 5, 88600, 23.0503, 5.17078e-05, 5.25987e-05, 5.19458e-05
PrevectorDeserializeNontrivial, 5, 6800, 5.32902, 0.00013906, 0.000195636, 0.000142078
PrevectorDeserializeTrivial, 5, 52000, 3.72396, 1.33016e-05, 1.64363e-05, 1.39542e-05
PrevectorDestructorNontrivial, 5, 28800, 8.92926, 5.91341e-05, 6.46597e-05, 6.28898e-05
PrevectorDestructorTrivial, 5, 88900, 7.11175, 1.53554e-05, 1.63233e-05, 1.61115e-05
PrevectorResizeNontrivial, 5, 28900, 5.94407, 3.94754e-05, 4.21574e-05, 4.12122e-05
PrevectorResizeTrivial, 5, 90300, 5.85012, 1.26935e-05, 1.35717e-05, 1.28682e-05
RIPEMD160, 5, 440, 6.42822, 0.00282782, 0.00305228, 0.00292482
RegularPadded, 5, 10000, 0.0279986, 4.90113e-07, 6.5017e-07, 5.60703e-07
RollingBloom, 5, 1500000, 5.31282, 6.39848e-07, 7.7822e-07, 6.86684e-07
RollingBloomReset, 5, 20000, 7.80648, 7.52528e-05, 8.27285e-05, 7.69287e-05
RpcMempool, 5, 40, 2.98067, 0.0144245, 0.0157567, 0.0149161
SHA1, 5, 570, 6.74604, 0.00219204, 0.00275459, 0.00231692
SHA256, 5, 340, 6.32523, 0.00362398, 0.00384323, 0.00371196
SHA256D64_1024, 5, 7400, 5.96159, 0.000154711, 0.000171021, 0.000159504
SHA256_32b, 5, 4700000, 6.35779, 2.57318e-07, 2.83581e-07, 2.7218e-07
SHA512, 5, 330, 6.03323, 0.00348758, 0.00388368, 0.00367868
SipHash_32b, 5, 40000000, 6.60139, 3.16447e-08, 3.39715e-08, 3.30938e-08
Sleep100ms, 5, 10, 5.00574, 0.100082, 0.100135, 0.100122
Trig, 5, 12000000, 1.0619, 1.55502e-08, 2.09141e-08, 1.66462e-08
VerifyNestedIfScript, 5, 100, 0.0642893, 0.000114644, 0.000145037, 0.000127313
VerifyScriptBench, 5, 6300, 5.78638, 0.00017174, 0.000196193, 0.000180358
WalletBalanceClean, 5, 8000, 1.91247, 4.34781e-05, 5.36606e-05, 4.7943e-05
WalletBalanceDirty, 5, 2500, 3.90809, 0.000291278, 0.000342813, 0.00031049
WalletBalanceMine, 5, 16000, 1.77235, 2.108e-05, 2.35293e-05, 2.18813e-05
WalletBalanceWatch, 5, 8000, 1.88894, 4.62994e-05, 4.81851e-05, 4.72744e-05

this branch, rebased to master

(pr/19478)$ ./src/bench/bench_bitcoin
# Benchmark, evals, iterations, total, min, max, median
AddrManAdd, 5, 5, 1.64512, 0.0651763, 0.0661761, 0.0659298
AddrManGetAddr, 5, 500, 6.3247, 0.00239997, 0.00279001, 0.0024759
AddrManGood, 5, 2, 6.62272, 0.629379, 0.739089, 0.634495
AddrManSelect, 5, 1000000, 3.32086, 5.37746e-07, 9.32109e-07, 5.66366e-07
AssembleBlock, 5, 700, 1.5542, 0.000426625, 0.000465905, 0.000444475
Base58CheckEncode, 5, 320000, 4.17478, 2.46738e-06, 2.91027e-06, 2.561e-06
Base58Decode, 5, 800000, 3.85293, 9.4571e-07, 9.81843e-07, 9.65822e-07
Base58Encode, 5, 470000, 3.51497, 1.43368e-06, 1.5361e-06, 1.52626e-06
Bech32Decode, 5, 800000, 2.25873, 5.30087e-07, 6.44574e-07, 5.52377e-07
Bech32Encode, 5, 800000, 3.19484, 7.32648e-07, 8.20117e-07, 8.15208e-07
BenchLockedPool, 5, 1300, 7.35172, 0.000980906, 0.00140676, 0.00106936
BenchTimeDeprecated, 5, 100000000, 3.30681, 6.27607e-09, 7.04851e-09, 6.62835e-09
BenchTimeMillis, 5, 6000000, 2.84261, 9.23429e-08, 9.91789e-08, 9.44398e-08
BenchTimeMillisSys, 5, 6000000, 2.97928, 9.24101e-08, 1.07893e-07, 9.93668e-08
BenchTimeMock, 5, 300000000, 3.33207, 2.18275e-09, 2.28349e-09, 2.21065e-09
BlockToJsonVerbose, 5, 10, 5.00496, 0.0960341, 0.105335, 0.100521
BnBExhaustion, 5, 650, 3.97426, 0.00111834, 0.00137522, 0.00117855
CCheckQueueSpeedPrevectorJob, 5, 1400, 7.85864, 0.00100836, 0.00120258, 0.00111514
CCoinsCaching, 5, 170000, 0.416761, 3.99718e-07, 6.26237e-07, 4.6736e-07
CHACHA20_1MB, 5, 340, 4.15272, 0.00232247, 0.00256342, 0.00241795
CHACHA20_256BYTES, 5, 250000, 0.821719, 6.04351e-07, 7.16052e-07, 6.47608e-07
CHACHA20_64BYTES, 5, 500000, 0.392367, 1.55135e-07, 1.60704e-07, 1.55682e-07
CHACHA20_POLY1305_AEAD_1MB_ENCRYPT_DECRYPT, 5, 340, 11.2433, 0.0064071, 0.00696151, 0.00653254
CHACHA20_POLY1305_AEAD_1MB_ONLY_ENCRYPT, 5, 340, 5.67251, 0.00320464, 0.00347243, 0.00329092
CHACHA20_POLY1305_AEAD_256BYTES_ENCRYPT_DECRYPT, 5, 250000, 2.72123, 2.14751e-06, 2.23938e-06, 2.16475e-06
CHACHA20_POLY1305_AEAD_256BYTES_ONLY_ENCRYPT, 5, 250000, 1.42395, 1.07704e-06, 1.25263e-06, 1.10776e-06
CHACHA20_POLY1305_AEAD_64BYTES_ENCRYPT_DECRYPT, 5, 500000, 2.51465, 9.83994e-07, 1.0357e-06, 9.93211e-07
CHACHA20_POLY1305_AEAD_64BYTES_ONLY_ENCRYPT, 5, 500000, 1.28575, 5.03617e-07, 5.34522e-07, 5.09937e-07
CoinSelection, 5, 650, 0.754373, 0.00021323, 0.000260981, 0.000223156
ComplexMemPool, 5, 1, 1.72752, 0.329097, 0.376804, 0.341924
ConstructGCSFilter, 5, 1000, 10.5092, 0.00204965, 0.00214565, 0.00210984
DeserializeAndCheckBlockTest, 5, 160, 6.12895, 0.00726007, 0.00793475, 0.0076974
DeserializeBlockTest, 5, 130, 4.25455, 0.00636748, 0.00672154, 0.00647489
DuplicateInputs, 5, 10, 0.632548, 0.0107695, 0.0151279, 0.0130934
FastRandom_1bit, 5, 440000000, 5.37446, 2.20134e-09, 2.76454e-09, 2.4326e-09
FastRandom_32bit, 5, 110000000, 6.259, 1.07203e-08, 1.26064e-08, 1.11061e-08
HASH_1MB, 5, 340, 6.68846, 0.0037417, 0.00407083, 0.00404221
HASH_256BYTES, 5, 250000, 1.93397, 1.43203e-06, 1.61792e-06, 1.55341e-06
HASH_64BYTES, 5, 500000, 1.9273, 7.53427e-07, 7.92554e-07, 7.69106e-07
MatchGCSFilter, 5, 50000, 7.41827, 2.9138e-05, 3.08851e-05, 2.91944e-05
MempoolEviction, 5, 41000, 8.62135, 4.0131e-05, 4.33651e-05, 4.2242e-05
MerkleRoot, 5, 800, 6.1202, 0.00146477, 0.00158029, 0.00154912
POLY1305_1MB, 5, 340, 1.60954, 0.00093638, 0.000971838, 0.000939872
POLY1305_256BYTES, 5, 250000, 0.299708, 2.3101e-07, 2.53748e-07, 2.39605e-07
POLY1305_64BYTES, 5, 500000, 0.199672, 7.49288e-08, 8.58498e-08, 7.96427e-08
PrePadded, 5, 10000, 0.0145195, 2.46289e-07, 3.66109e-07, 2.7668e-07
PrevectorClearNontrivial, 5, 28300, 9.12451, 6.1713e-05, 6.81542e-05, 6.43679e-05
PrevectorClearTrivial, 5, 88600, 23.7796, 5.16061e-05, 5.61612e-05, 5.26222e-05
PrevectorDeserializeNontrivial, 5, 6800, 4.70353, 0.000125809, 0.000155278, 0.0001387
PrevectorDeserializeTrivial, 5, 52000, 4.00423, 1.4031e-05, 1.73909e-05, 1.49766e-05
PrevectorDestructorNontrivial, 5, 28800, 10.0167, 6.71979e-05, 7.25358e-05, 6.8967e-05
PrevectorDestructorTrivial, 5, 88900, 6.70445, 1.45854e-05, 1.55963e-05, 1.51922e-05
PrevectorResizeNontrivial, 5, 28900, 5.86289, 3.87845e-05, 4.19374e-05, 4.1117e-05
PrevectorResizeTrivial, 5, 90300, 6.17383, 1.30828e-05, 1.48376e-05, 1.35303e-05
RIPEMD160, 5, 440, 6.65392, 0.00289608, 0.00322838, 0.00296501
RegularPadded, 5, 10000, 0.024881, 4.61499e-07, 5.36864e-07, 4.99317e-07
RollingBloom, 5, 1500000, 5.26776, 6.34446e-07, 7.91226e-07, 6.99945e-07
RollingBloomReset, 5, 20000, 8.06222, 7.65343e-05, 8.62762e-05, 7.88462e-05
RpcMempool, 5, 40, 2.85708, 0.0141566, 0.0145827, 0.0142359
SHA1, 5, 570, 6.69593, 0.00221977, 0.00245258, 0.00232601
SHA256, 5, 340, 6.37176, 0.00357123, 0.00393608, 0.0037714
SHA256D64_1024, 5, 7400, 5.96427, 0.000156924, 0.000165802, 0.000161775
SHA256_32b, 5, 4700000, 6.34025, 2.61224e-07, 2.77951e-07, 2.69953e-07
SHA512, 5, 330, 5.99635, 0.0035573, 0.00368859, 0.0036532
SipHash_32b, 5, 40000000, 6.50056, 3.13885e-08, 3.40995e-08, 3.22946e-08
Sleep100ms, 5, 10, 5.00548, 0.100081, 0.100137, 0.100105
Trig, 5, 12000000, 1.15828, 1.48329e-08, 2.88076e-08, 1.67105e-08
VerifyNestedIfScript, 5, 100, 0.0523492, 8.93512e-05, 0.000123463, 0.000103065
VerifyScriptBench, 5, 6300, 5.64757, 0.000166801, 0.00019268, 0.000176785
WalletBalanceClean, 5, 8000, 1.95422, 4.57885e-05, 5.51424e-05, 4.67078e-05
WalletBalanceDirty, 5, 2500, 3.97355, 0.000295304, 0.000335195, 0.000318887
WalletBalanceMine, 5, 16000, 1.9643, 2.28496e-05, 2.7418e-05, 2.38308e-05
WalletBalanceWatch, 5, 8000, 2.03689, 4.80869e-05, 5.49796e-05, 4.97744e-05

@JeremyRubin JeremyRubin changed the title from "Kill mapLinks data structure in Mempool" to "Remove CTxMempool::mapLinks data structure member" on Jul 20, 2020
@@ -172,16 +172,17 @@ bool CTxMemPool::CalculateMemPoolAncestors(const CTxMemPoolEntry &entry, setEntr
// If we're not searching for parents, we require this to be an
// entry in the mempool already.
Contributor Author

When we say "we require" in this comment, previously we had an assertion that this was true, this patch removes this assertion. This should be safe to do, as it would have previously resulted in crashes. But if we want the assertion checked, we should add back a lookup to mapTX.

Contributor Author

This is because previously we had to constantly consistency-check that we weren't looking at a dead mempool entry https://github.com/bitcoin/bitcoin/pull/19478/files#diff-ca74c4b28865382863b8fe7633a85cd6L987

Maybe one way to get around this would be to give CTxMemPoolEntry a bool "dead" flag, set if the entry has been evicted while a valid reference is still dangling somewhere. That situation is unconditionally a bug, but the flag would let us add asserts back to CTxMemPoolEntry::GetMemPool{Children,Parents}{,Const}.

I think this can be handled in follow-up work though, as the major need for this consistency check was the fraught ownership model in mapLinks.
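A hypothetical sketch of that "dead flag" idea (not part of this PR; the types and names here are stand-ins, not the real CTxMemPoolEntry API):

#include <cassert>
#include <set>

class EntrySketch
{
public:
    using Children = std::set<int>;  // stand-in for the real reference-based Children set

    void MarkDead() const { m_dead = true; }  // would be called when the entry is evicted/removed
    const Children& GetMemPoolChildrenConst() const
    {
        assert(!m_dead);  // touching an evicted entry through a dangling reference is unconditionally a bug
        return m_children;
    }

private:
    mutable bool m_dead{false};
    Children m_children;
};

int main()
{
    EntrySketch e;
    (void)e.GetMemPoolChildrenConst();  // fine while the entry is alive
    e.MarkDead();
    // e.GetMemPoolChildrenConst();     // would now assert: any remaining reference is dangling
    return 0;
}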

@JeremyRubin
Contributor Author

@jonatack thanks for the review! Cool to see it making an even bigger impact on other people's systems -- can you share your hardware specs? Mine were run on an AMD Ryzen 7 1800X eight-core at 3.6 GHz with 32 GB RAM at 2133 MT/s. It generally makes sense that on lesser hardware (more total time on your benches) there may be other factors contributing to a bigger speedup.

W.r.t. Boost features, iterator_to is a member function of the multi_index container, so it's not a new feature but an existing API. In any case, it has been in use in the mempool since 2015 (line 174 of 60de0d5), so it should be fine.

All nits have been addressed.

@jonatack
Member

jonatack commented Jul 21, 2020

Intel i7-6500U 4 core @ 2.5GHz, 32 gb RAM DDR4 2666, SSD nvme Samsung 960 Pro 1T, running Debian bullseye/sid with Coreboot (I need to upgrade to a faster CPU but it's not likely to happen soon).

Code review re-ACK 2a47af4 per git diff dc0d3fb 2a47af4

Member

@hebasto hebasto left a comment

Concept ACK.
Tested 2a47af4 on Linux Mint 20 (x86_64).

  1. The first commit c977ecc "Part 1/2..." does not compile. I think we should avoid such situations in the git history.

  2. I strictly prefer that new code follows the coding style, therefore mind applying clang-format-diff.py to it?

@JeremyRubin JeremyRubin force-pushed the epoch-mempool-clean-split-3 branch 3 times, most recently from 09832f5 to 74a780a Compare July 22, 2020 17:53
@JeremyRubin
Contributor Author

OK, I've gone ahead and squashed all the nits and intermediate commits so that bisecting should work cleanly.

I think the resulting history is a tad harder to review than with the two parts, since there was a clearer delineation between the architectural refactoring, the call-site changes, and the style-only changes to touched code, but it is what it is.

@jonatack you'll notice some formatting changes when comparing to the prior commit.

git diff 2a47af4 74a780a

diff --git a/src/txmempool.h b/src/txmempool.h
index 8050eaefa..a3a0197d9 100644
--- a/src/txmempool.h
+++ b/src/txmempool.h
@@ -50,13 +50,16 @@ struct LockPoints
 };
 
 struct CompareIteratorByHashGeneric {
-    // SFINAE for a is a pointer or a reference
-    template<typename T>
-    bool operator()(const std::reference_wrapper<T> &a, const std::reference_wrapper<T> &b) const {
+    // SFINAE for T where T is either a pointer type (e.g., a txiter) or a reference_wrapper<T>
+    // (e.g. a wrapped CTxMemPoolEntry&)
+    template <typename T>
+    bool operator()(const std::reference_wrapper<T>& a, const std::reference_wrapper<T>& b) const
+    {
         return a.get().GetTx().GetHash() < b.get().GetTx().GetHash();
     }
-    template<typename T>
-    bool operator()(const T &a, const T &b) const {
+    template <typename T>
+    bool operator()(const T& a, const T& b) const
+    {
         return a->GetTx().GetHash() < b->GetTx().GetHash();
     }
 };
@@ -79,6 +82,7 @@ public:
     // two aliases, should the types ever diverge
     typedef std::set<CTxMemPoolEntryRef, CompareIteratorByHashGeneric> Parents;
     typedef std::set<CTxMemPoolEntryRef, CompareIteratorByHashGeneric> Children;
+
 private:
     const CTransactionRef tx;
     mutable Parents m_parents;
@@ -145,10 +149,10 @@ public:
     CAmount GetModFeesWithAncestors() const { return nModFeesWithAncestors; }
     int64_t GetSigOpCostWithAncestors() const { return nSigOpCostWithAncestors; }
 
-    const Parents& GetMemPoolParentsConst() const {return m_parents;}
-    const Children& GetMemPoolChildrenConst() const {return m_children;}
-    Parents& GetMemPoolParents() const {return m_parents;}
-    Children& GetMemPoolChildren() const {return m_children;}
+    const Parents& GetMemPoolParentsConst() const { return m_parents; }
+    const Children& GetMemPoolChildrenConst() const { return m_children; }
+    Parents& GetMemPoolParents() const { return m_parents; }
+    Children& GetMemPoolChildren() const { return m_children; }
 
     mutable size_t vTxHashesIdx; //!< Index in mempool's vTxHashes
     mutable uint64_t m_epoch; //!< epoch when last touched, useful for graph algorithms
@@ -547,7 +551,6 @@ public:
     std::vector<std::pair<uint256, txiter>> vTxHashes GUARDED_BY(cs); //!< All tx witness hashes/entries in mapTx, in random order
 
     typedef std::set<txiter, CompareIteratorByHashGeneric> setEntries;
-    typedef std::vector<txiter> vecEntries;
 
     uint64_t CalculateDescendantMaximum(txiter entry) const EXCLUSIVE_LOCKS_REQUIRED(cs);
 private:

Member

@hebasto hebasto left a comment

ACK 74a780a, tested on Linux Mint 20 (x86_64).

Benchmarking results:

$ ./src/bench/bench_bitcoin -filter="ComplexMemPool"
# Benchmark, evals, iterations, total, min, max, median
ComplexMemPool, 5, 1, 1.42105, 0.283276, 0.286269, 0.283499
$ ./src/bench/bench_bitcoin -filter="ComplexMemPool"
# Benchmark, evals, iterations, total, min, max, median
ComplexMemPool, 5, 1, 1.21549, 0.240644, 0.246652, 0.243024

@jonatack
Member

jonatack commented Sep 3, 2020

The wallet_basic.py --descriptors Travis test failure is unrelated. The test passes for me locally on this branch and I've seen the same failure, in one Travis job only, in other PRs reviewed today.

Member

@hebasto hebasto left a comment

re-ACK 3ba1665, only rebased and renamed s/CompareIteratorByHashGeneric/CompareIteratorByHash/ since my previous review.

@JeremyRubin JeremyRubin force-pushed the epoch-mempool-clean-split-3 branch from 3ba1665 to 296be8f Compare September 4, 2020 16:47
@JeremyRubin
Contributor Author

Since #19854 is merged, this conflicts and has been rebased. Apologies to @jonatack and @hebasto, but mind re-slapping an ACK on this when tests pass? diff is just adding AssertLockHeld(cs) to UpdateChild/UpdateParent, which would not cleanly merge with #19854.

Member

@hebasto hebasto left a comment

re-ACK 296be8f, only rebased since my previous review (verified with git range-diff).

@jonatack
Member

jonatack commented Sep 4, 2020

re-ACK 296be8f per git range-diff ab338a19 3ba1665 296be8f, sanity check gcc 10.2 debug build is clean.

Member

@maflcko maflcko left a comment

Any reason the commits aren't squashed? I left some style nits, but feel free to ignore the stupid ones. Though, it would be good to fixup the documentation ones. Also those:

git grep --extended-regexp '(setMemPool|mapLinks)'

Also checked that the dynamic size doesn't increase:

diff --git a/src/bench/mempool_stress.cpp b/src/bench/mempool_stress.cpp
index 89233e390c..7b07f7723e 100644
--- a/src/bench/mempool_stress.cpp
+++ b/src/bench/mempool_stress.cpp
@@ -86,6 +86,8 @@ static void ComplexMemPool(benchmark::Bench& bench)
         for (auto& tx : ordered_coins) {
             AddTx(tx, pool);
         }
+        std::cout << pool.DynamicMemoryUsage() << std::endl;
+        assert(false);
         pool.TrimToSize(pool.DynamicMemoryUsage() * 3 / 4);
         pool.TrimToSize(GetVirtualTransactionSize(*ordered_coins.front()));
     });
Before: 3499728
After:  3442128

a76734e4471002f0643bef86fbbea9a2da187f4e

@@ -55,45 +55,45 @@ size_t CTxMemPoolEntry::GetTxSize() const
}

// Update the given tx for any in-mempool descendants.
// Assumes that setMemPoolChildren is correct for the given tx and all
// Assumes that CTxMemPool::m_children is correct for the given tx and all
Member

Suggested change
// Assumes that CTxMemPool::m_children is correct for the given tx and all
// Assumes that CTxMemPoolEntry::m_children is correct for the given tx and all

Contributor Author

Ah good catch on the m_children.

@@ -119,7 +119,7 @@ void CTxMemPool::UpdateTransactionsFromBlock(const std::vector<uint256> &vHashes
// Iterate in reverse, so that whenever we are looking at a transaction
// we are sure that all in-mempool descendants have already been processed.
// This maximizes the benefit of the descendant cache and guarantees that
// setMemPoolChildren will be updated, an assumption made in
// CTxMemPool::m_children will be updated, an assumption made in
Member

same

@@ -128,8 +128,8 @@ void CTxMemPool::UpdateTransactionsFromBlock(const std::vector<uint256> &vHashes
continue;
}
auto iter = mapNextTx.lower_bound(COutPoint(hash, 0));
// First calculate the children, and update setMemPoolChildren to
// include them, and update their setMemPoolParents to include this tx.
// First calculate the children, and update CTxMemPool::m_children to
Member

same

// Here we only update statistics and not data in mapLinks (which
// we need to preserve until we're finished with all operations that
// need to traverse the mempool).
// Here we only update statistics and not data in CTxMemPool::Parents
Member

same

@JeremyRubin
Contributor Author

@MarcoFalke do you have a preference for how some of these are handled? Since it's all non-functional style or documentation concerns, I think it would be OK to do them post-merge, as there's a lot of other pending work that will touch/redo these areas eventually in any case.

As to why it's 2 commits and not one: it used to be more commits, and then I slimmed it down. It previously made more sense when the prior patches were something like 1) introduce new API, 2) use new API, 3) delete old API, but 1 and 2 got squashed by request since there were a couple of issues with the API changing the underlying return type, which made patch 1+2 fail to compile (even though it was readable). I don't care strongly, but I have a mild preference for not burning out the existing reviewers.

@maflcko
Member

maflcko commented Sep 7, 2020

No preference on how those are handled, but re-ACKing some doc fixups should cost as much (or as little) review effort as creating a new commit/pull and getting that reviewed, no?

@JeremyRubin
Contributor Author

No, because there's already outstanding work (I have something in the range of 40-70 other commits staged since last year for the mempool project) that is going to touch these areas and force me to rewrite some of the documentation anyway before it is PR-ready. I'm not making a separate docs-only fixup.

So for the most part, the style changes and docs changes are already going to have to be re-reviewed when those follow-on changes land.

Originally I intended not to update much documentation along the way (except where critical) because the mempool documentation is already very stale, but to honor some requests to improve it incrementally I added more here. This points to why "just fixing up the docs" isn't really an option: they're so stale that any effort to improve them inevitably invites more bikeshedding, because they're unclear and people want something easier to read.

@maflcko maflcko merged commit 2583966 into bitcoin:master Sep 7, 2020
@maflcko
Member

maflcko commented Sep 7, 2020

I wouldn't call that bikeshedding. Referring to CTxMemPoolEntry::m_children with CTxMemPool::m_children is objectively wrong and someone will create pull requests to fix it. But if the documentation/code goes away later then it might not be worth it to spend much time on it. Though, for reference it would be nice if you could link to pull requests that rework/remove the docs.

@JeremyRubin
Contributor Author

Thanks for the merge -- if someone wants to clean up the docs that's fine.

Yes, the docs were objectively wrong, I probably fucked up the autocomplete when changing it, and if someone wants to fix it before I get around to it, that's dandy. Just noting that the prior name -- setMemPoolChildren -- was wrong for something like 5 years (no such member ever existed anywhere) and no one fixed it, so I sort of doubt anyone is actually reading these docs to learn how the code works anyway.

It needs a decent amount of rebasing and extraction of specific chunks, but I'm picking out patches from #17268 and JeremyRubin#5. As I do that, a lot of the function signatures will change, which is a good time to update the docs to be more coherent.

@maflcko
Member

maflcko commented Sep 7, 2020

For my own reference, the broken docs are

git grep --extended-regexp '(setMemPool|mapLinks|CTxMemPool::(Parents|m_children))'

sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Sep 9, 2020
296be8f Get rid of unused functions CTxMemPool::GetMemPoolChildren, CTxMemPool::GetMemPoolParents (Jeremy Rubin)
46d955d Remove mapLinks in favor of entry inlined structs with iterator type erasure (Jeremy Rubin)

Pull request description: same as the description at the top of this thread.

ACKs for top commit:
  jonatack:
    re-ACK 296be8f per `git range-diff ab338a1 3ba1665 296be8f`, sanity check gcc 10.2 debug build is clean.
  hebasto:
    re-ACK 296be8f, only rebased since my [previous](bitcoin#19478 (review)) review (verified with `git range-diff`).

Tree-SHA512: f5c30a4936fcde6ae32a02823c303b3568a747c2681d11f87df88a149f984a6d3b4c81f391859afbeb68864ef7f6a3d8779f74a58e3de701b3d51f78e498682e
kiminuo added a commit to kiminuo/bitcoin that referenced this pull request Mar 14, 2021
kiminuo added a commit to kiminuo/bitcoin that referenced this pull request Mar 14, 2021
kiminuo added a commit to kiminuo/bitcoin that referenced this pull request Mar 14, 2021
kiminuo added a commit to kiminuo/bitcoin that referenced this pull request Mar 14, 2021
kiminuo added a commit to kiminuo/bitcoin that referenced this pull request Mar 23, 2021
kiminuo added a commit to kiminuo/bitcoin that referenced this pull request Sep 2, 2021
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Sep 27, 2021
…erasure

Summary:
PR description:
> Remove CTxMempool::mapLinks data structure member
>
> (full PR description, as quoted at the top of this thread)

This is a backport of [[bitcoin/bitcoin#19478 | core#19478]]

Test Plan:
`ninja all check-all`

Performance improvement confirmed with `bitcoin-bench`:

```
$ diff before.log after.log  | grep ComplexMemPool
  |             ns/elem |              elem/s |    err% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------:|:----------
< |      199,336,736.00 |                5.02 |    0.4% |      2.21 | `ComplexMemPool`
> |      181,013,454.00 |                5.52 |    0.4% |      1.99 | `ComplexMemPool`
```

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Subscribers: Fabien

Differential Revision: https://reviews.bitcoinabc.org/D10191
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
@hebasto
Member

hebasto commented Oct 8, 2022

I wouldn't call that bikeshedding. Referring to CTxMemPoolEntry::m_children with CTxMemPool::m_children is objectively wrong and someone will create pull requests to fix it. But if the documentation/code goes away later then it might not be worth it to spend much time on it. Though, for reference it would be nice if you could link to pull requests that rework/remove the docs.

Fixed in #26281.
