-
Notifications
You must be signed in to change notification settings - Fork 37.8k
[kernel 2e/n] miner: Make mempool
optional, stop constructing temporary empty mempools
#25223
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[kernel 2e/n] miner: Make mempool
optional, stop constructing temporary empty mempools
#25223
Conversation
mempool
optional, stop constructing temporary empty mempoolsmempool
optional, stop constructing temporary empty mempools
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Can you explain what this change has to do with the kernel? I don't see any modifications in the kernel folder or the kernel build target.s |
I'm also not clear on how making BlockAssembler constructor not require an empty mempool relates to making kernel library better, or lets it drop dependencies. (This still does seem like a positive change, just connection is unclear)
Maybe you already know this, but one thing that I wasn't expecting looking at Carl's branch is that it mostly doesn't move source code into |
Ah yes, as the companion PR to #25215, this PR also "reduces the number of call sites where we explicitly construct CTxMemPool. This is done in preparation for later PRs which decouple the mempool module from ArgsManager..." One of the challenges I found with making the upcoming As an example: When we stop asking I reasoned that since these |
Yeah I think that's right. Mostly I just don't think moving things around and changing namespaces for everything needs to be done right now. Looking at past reviews it also seems that people expect things in |
SkipMapTxEntry is a short helper function that's only used in addPackageTxs, we can just inline it, keep the comments, and avoid the unnecessary interface and lock annotations.
Since UpdatePackagesForAdded is a helper function that's only used in addPackageTxs we can make it static and avoid the unnecessary interface and in-header lock annotation.
f159dbb
to
10827ac
Compare
I also wanted to ask what libbitcoinkernel has to do with mining. And if there is mining in it, why should the mempool be optional. |
libbitcoinkernel doesn't have anything to do with mining. However, mining does require (prior to this PR) a mempool. In many cases, these mempool are sometimes constructed on-the-fly (not as part of init). For example: Lines 357 to 358 in 640eb77
Therefore, when I start making CTxMemPool::Options mempool_opts{};
OverlayArgsManSettings(mempool_opts);
CTxMemPool empty_mempool{mempool_opts};
std::unique_ptr<CBlockTemplate> blocktemplate(BlockAssembler{chainman.ActiveChainstate(), empty_mempool}.CreateNewBlock(coinbase_script)); Now, it would be another story if any of these |
Wouldn't it be better to move mining out of libbitcoinkernel and not motivate this pull request on
So an alternative would be to make |
Mining is not in
Yeah I suppose so, but it's also fraught with danger: Does the
|
At least for the purposes of the miner this should not be the case. It will result in an empty block regardless of the options. Personally I think it is fine to fall back to the defaults. After all this is also what happens with gArgs when no options are passed to the ArgsManager. Though, I am also fine with your approach. |
10827ac
to
faddefe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK, agree with narrowing the interface for constructing mempools. In a number of ways, CTxMemPool
is dependent on the caller to manage things for it; it requires somebody else to ask gArgs for maxmempool and call TrimToSize
, get the active chainstate and pass in the coinsview to do consistency checks, etc. Constructing a mempool outside of the "main" path means we might forget to do something that CTxMemPool
doesn't take responsibility for; #24634 essentially stems from this.
My initial reaction to seeing these empty mempools was "why aren't they using node_context.mempool
or chainstate.GetMempool()
?" and the answer was that they need to skirt around the fact that BlockAssembler
will try to fill up the block template using transactions from the mempool.
we could potentially refine the BlockAssembler interface further...
Also concept ACK to this - we have 2 ways of making a block template: (1) manually add the txns we want or (2) use the miner algo to maximize fees, and BlockAssembler
should be responsible for creating both. The latter case (i.e. in generateblock
and CreateBlock
) is currently implemented by manually adding transactions to block.vtx
and then calling RegenerateCommitments()
- a bit of a hack, repeated across multiple tests, and potentially unsafe. It should be replaced with something like BlockAssembler::CreateNewBlock(manual_txns)
which would internally handle the merkle commitments, call TestBlockValidity()
, etc.
One question though - it seems to me that we'd still want mempool in both cases? Why would we manually insert transactions that aren't in our mempool?
Regardless, I agree that making the mempool parameter optional is much better than creating placeholder mempools.
...also adjust callers Changes: - In BlockAssembler::CreateNewBlock, we now only lock m_mempool->cs and call addPackageTxs if m_mempool is not nullptr - BlockAssembler::addPackageTxs now takes in a mempool reference, and is annotated to require that mempool's lock. - In TestChain100Setup::CreateBlock and generateblock, don't construct an empty mempool, just pass in a nullptr for mempool
faddefe
to
0f1a259
Compare
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about adding a boolean to CreateNewBlock
indicating "I don't want any transactions other than the coinbase," instead of making mempool optional? This can later easily be replaced with some txids/txns indicating "please manually add these to the block template." Would it still help you detangle mempool from the argsmanager? I wrote a branch for this approach: glozow@9590c8d
@@ -97,7 +97,7 @@ void Finish(FuzzedDataProvider& fuzzed_data_provider, MockedTxPool& tx_pool, CCh | |||
BlockAssembler::Options options; | |||
options.nBlockMaxWeight = fuzzed_data_provider.ConsumeIntegralInRange(0U, MAX_BLOCK_WEIGHT); | |||
options.blockMinFeeRate = CFeeRate{ConsumeMoney(fuzzed_data_provider, /*max=*/COIN)}; | |||
auto assembler = BlockAssembler{chainstate, *static_cast<CTxMemPool*>(&tx_pool), options}; | |||
auto assembler = BlockAssembler{chainstate, &tx_pool, options}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: why'd you remove the static cast? Not suggesting you add it back, just wondering
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I just thought it was unnecessary. I think Marco added it in fa03d0a
I don't care either way what we do here!
Talked with Gloria offline, I'm fine with either approach as long as we don't create extraneous
Ping @MarcoFalke @glozow |
Ye thanks for summarizing @dongcarl, also asked @MarcoFalke offline
It seems like the answer to this is yes, and we already rely on this for things like
At the very least, we use |
I think:
About the implementation: I think anything works. You could even go with both approaches at the same time. For example, you could make mempool a pointer to indicate whether to fill the initial block template with txs from the mempool. And you could add another option to pass a list of txs unconditionally added to the initial template. |
ACK 0f1a259 Beyond avoiding the construction of temporary mempools, I was confused about why we'd ever use |
Many thanks for the insightful comments Marco and Gloria! Lots to refer back to in future convos for sure. Glad that the current approach is good to move forward with! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
ACK 0f1a259 🐊
Show signature
Signature:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
ACK 0f1a259657280afc727db97689512aef5ca928fc 🐊
-----BEGIN PGP SIGNATURE-----
iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
pUhBFAv+Mu3PolWzpLYwd2nrk8Zpq6v9Jw7/TPIRCG2BgmUs0DQ5dht7aqKhiFEp
PiY8mbyM4A9XRHTZcJ1Up+G0tG+o3iMnE5z+9l6AnDRUHZ1N9j0OpxXMq6/ZwvuN
y5CNje2QIARf8oytize8w9Y/l+PxfLwSuFdOx0YHxwE3ZcosvVUUTeZO41QrI6Hu
Hhdsutm/qf25xsRTo/cs3ROrnTBI7wBi+VK6wggdyACSkFKQZWfA7mQiPBf3y+Ji
/vKXWD6P1n2tQJxDydKsHQ7r13/y5wKJQjJ5oN+0DgeWFCkbmiLY7r+jH3I1Yt8m
TG1vSj83awiC2SPnyOto1RYyA7AIWO+XArDuxYbFtYneM1dwj00oJmihUfguI47y
+CD26TsV6jCTXD1/l1TCAH6SCBR24tj1GJ/UbYPGlNAdpzS02UPGVGUttOm7wivS
8YVc7iw/CykiNdOAaqtiTbrRxh2QcGjDWyOhORMit7Ft94RJjstbnGVspZdrChK5
4tBjHzqC
=Ha/h
-----END PGP SIGNATURE-----
// guaranteed to fail again, but as a belt-and-suspenders check we put it in | ||
// failedTx and avoid re-evaluation, since the re-evaluation would be using | ||
// cached size/sigops/fee values that are not actually correct. | ||
bool BlockAssembler::SkipMapTxEntry(CTxMemPool::txiter it, indexed_modified_transaction_set& mapModifiedTx, CTxMemPool::setEntries& failedTx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the first commit:
Any reason to not use the same approach from the second commit and make this function static
? I understand that this would require passing the mempool solely to check the lock annotation, but I think this makes the diff smaller, keeps the structure of the file, and avoids making a large function even larger by inlining.
If there are no other reasons I'd prefer to keep this as-is for now and leave restructuring to changes that actually need it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I was looking at it, this function just seemed so intertwined with the rest of addPackageTxs
. Both mapModifiedTx
and failedTx
are local variables to addPackageTxs
and without the function abstraction we can also drop the AssertLockHeld
. I agree that addPackageTxs
is quite large and we could break it down into pieces where appropriate but I'm not sure that this is the right place.
If we were to make it static we'd have to pass in the mempool and inBlock.
Code review ACK 0f1a259 |
@@ -300,17 +290,17 @@ void BlockAssembler::SortForBlock(const CTxMemPool::setEntries& package, std::ve | |||
// Each time through the loop, we compare the best transaction in | |||
// mapModifiedTxs with the next transaction in the mempool to decide what | |||
// transaction package to work on next. | |||
void BlockAssembler::addPackageTxs(int& nPackagesSelected, int& nDescendantsUpdated) | |||
void BlockAssembler::addPackageTxs(const CTxMemPool& mempool, int& nPackagesSelected, int& nDescendantsUpdated) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The BlockAssembler::addPackageTxs
function now has access two different mempool variables: a mempool
argument, and a this->m_mempool
member variable. Right now these point to the same mempool, and the current addPackageTxs
code only refers to mempool
instead of m_mempool
after this PR. But there is no enforcement for these things, so they can easily change and bugs could show up with no one noticing.
IMO, either this new mempool
argument should be dropped and function should go back to using m_mempool
, or the m_mempool
member should be dropped so it is guaranteed that function is only using mempool
. But having multiple aliases for no reason is confusing and asking for bugs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added to the project issue TODO list: #24303
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the function could be made static
and all needed members be passed in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean honestly all the "helper functions for addPackageTxs()" listed in the header could be made static, also AddToBlock
, and then inBlock
could be a local var in addPackageTxs
. That would minimize the BlockAssembler
interface quite nicely. Lots of shuffling around tho.
d273e53 bench/rpc_mempool: Create ChainTestingSetup, use its CTxMemPool (Carl Dong) 020caba bench: Use existing CTxMemPool in TestingSetup (Carl Dong) 86e732d scripted-diff: test: Use CTxMemPool in TestingSetup (Carl Dong) 213457e test/policyestimator: Use ChainTestingSetup's CTxMemPool (Carl Dong) 319f0ce rest/getutxos: Don't construct empty mempool (Carl Dong) 03574b9 tree-wide: clang-format CTxMemPool references (Carl Dong) Pull request description: This is part of the `libbitcoinkernel` project: #24303, https://github.com/bitcoin/bitcoin/projects/18 This PR reduces the number of call sites where we explicitly construct CTxMemPool. This is done in preparation for later PRs which decouple the mempool module from `ArgsManager`, eventually all of libbitcoinkernel will be decoupled from `ArgsManager`. The changes in this PR: - Allows us to have less code churn as we modify `CTxMemPool`'s constructor in later PRs - In many cases, we can make use of existing `CTxMemPool` instances, getting rid of extraneous constructions - In other cases, we construct a `ChainTestingSetup` and use the `CTxMemPool` there, so that we can rely on the logic in `setup_common` to set things up correctly ## Notes for Reviewers ### A note on using existing mempools When evaluating whether or not it's appropriate to use an existing mempool in a `*TestingSetup` struct, the key is to make sure that the mempool has the same lifetime as the `*TestingSetup` struct. Example 1: In [`src/fuzz/tx_pool.cpp`](https://github.com/bitcoin/bitcoin/blob/b4f686952a60bbadc7ed2250651d0d6af0959f4d/src/test/fuzz/tx_pool.cpp), the `TestingSetup` is initialized in `initialize_tx_pool` and lives as a static global, while the `CTxMemPool` is in the `tx_pool_standard` fuzz target, meaning that each time the `tx_pool_standard` fuzz target gets run, a new `CTxMemPool` is created. If we were to use the static global `TestingSetup`'s CTxMemPool we might run into problems since its `CTxMemPool` will carry state between subsequent runs. This is why we don't modify `src/fuzz/tx_pool.cpp` in this PR. Example 2: In [`src/bench/mempool_eviction.cpp`](https://github.com/bitcoin/bitcoin/blob/b4f686952a60bbadc7ed2250651d0d6af0959f4d/src/bench/mempool_eviction.cpp), we see that the `TestingSetup` is in the same scope as the constructed `CTxMemPool`, so it is safe to use its `CTxMemPool`. ### A note on checking `CTxMemPool` ctor call sites After the "tree-wide: clang-format CTxMemPool references" commit, you can find all `CTxMemPool` ctor call sites with the following command: ```sh git grep -E -e 'make_unique<CTxMemPool>' \ -e '\bCTxMemPool\s+[^({;]+[({]' \ -e '\bCTxMemPool\s+[^;]+;' \ -e '\bnew\s+CTxMemPool\b' ``` At the end of the PR, you will find that there are still quite a few call sites that we can seemingly get rid of: ```sh $ git grep -E -e 'make_unique<CTxMemPool>' -e '\bCTxMemPool\s+[^({;]+[({]' -e '\bCTxMemPool\s+[^;]+;' -e '\bnew\s+CTxMemPool\b' # rearranged for easier explication src/init.cpp: node.mempool = std::make_unique<CTxMemPool>(node.fee_estimator.get(), mempool_check_ratio); src/test/util/setup_common.cpp: m_node.mempool = std::make_unique<CTxMemPool>(m_node.fee_estimator.get(), 1); src/rpc/mining.cpp: CTxMemPool empty_mempool; src/test/util/setup_common.cpp: CTxMemPool empty_pool; src/bench/mempool_stress.cpp: CTxMemPool pool; src/bench/mempool_stress.cpp: CTxMemPool pool; src/test/fuzz/rbf.cpp: CTxMemPool pool; src/test/fuzz/tx_pool.cpp: CTxMemPool tx_pool_{/*estimator=*/nullptr, /*check_ratio=*/1}; src/test/fuzz/tx_pool.cpp: CTxMemPool tx_pool_{/*estimator=*/nullptr, /*check_ratio=*/1}; src/test/fuzz/validation_load_mempool.cpp: CTxMemPool pool{}; src/txmempool.h: /** Create a new CTxMemPool. ``` Let's break them down one by one: ``` src/init.cpp: node.mempool = std::make_unique<CTxMemPool>(node.fee_estimator.get(), mempool_check_ratio); src/test/util/setup_common.cpp: m_node.mempool = std::make_unique<CTxMemPool>(m_node.fee_estimator.get(), 1); ``` Necessary ----- ``` src/rpc/mining.cpp: CTxMemPool empty_mempool; src/test/util/setup_common.cpp: CTxMemPool empty_pool; ``` These are fixed in #25223 where we stop requiring the `BlockAssembler` to have a `CTxMemPool` if it's not going to consult it anyway (as is the case in these two call sites) ----- ``` src/bench/mempool_stress.cpp: CTxMemPool pool; src/bench/mempool_stress.cpp: CTxMemPool pool; ``` Fixed in #24927. ----- ``` src/test/fuzz/rbf.cpp: CTxMemPool pool; src/test/fuzz/tx_pool.cpp: CTxMemPool tx_pool_{/*estimator=*/nullptr, /*check_ratio=*/1}; src/test/fuzz/tx_pool.cpp: CTxMemPool tx_pool_{/*estimator=*/nullptr, /*check_ratio=*/1}; src/test/fuzz/validation_load_mempool.cpp: CTxMemPool pool{}; ``` These are all cases where we don't want the `CTxMemPool` state to persist between runs, see the previous section "A note on using existing mempools" ----- ``` src/txmempool.h: /** Create a new CTxMemPool. ``` It's a comment (someone link me to a grep that understands syntax plz thx) ACKs for top commit: laanwj: Code review ACK d273e53 Tree-SHA512: c4ff3d23217a7cc4a7145defc7b901725073ef73bcac3a252ed75f672c87e98ca0368d1d8c3f606b5b49f641e7d8387d26ef802141b650b215876f191fb6d5f9
This is part of the libbitcoinkernel project: #24303, https://github.com/bitcoin/bitcoin/projects/18
This is NOT dependent on, but is a "companion-PR" to #25215.
Abstract
This PR removes the need to construct
BlockAssembler
with temporary, empty mempools in cases where we don't want to source transactions from the mempool (e.g. inTestChain100Setup::CreateBlock
andgenerateblock
). After this PR,BlockAssembler
will accept aCTxMemPool
pointer and handle thenullptr
case instead of requiring aCTxMemPool
reference.An overview of the changes is best seen in the changes in the header file:
Alternatives
Aside from approach in this current PR, we can also take the approach of moving the
CTxMemPool*
argument from theBlockAssembler
constructor toBlockAssembler::CreateNewBlock
, since that's where it's needed anyway. I did not push this approach because it requires quite a lot of call sites to be changed. However, I do have it coded up and can do that if people express a strong preference. This would look something like:Future work
Although wholly out of scope for this PR, we could potentially refine the
BlockAssembler
interface further, so that we have:Whereby
TestChain100Setup::CreateBlock
andgenerateblock
would call theBlockAssembler::CreateNewBlock
that takes inCTransaction
s and we can potentially removeRegenerateCommitments
altogether. All other callers can use theCTxMemPool
version.