-
Notifications
You must be signed in to change notification settings - Fork 37.7k
Detect and ignore transactions that were CPFP'd in the fee estimator #25380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cc @glozow @instagibbs @naumenkogs since you Concept ACK'd #23074. This is implementing #23074 (comment). |
seems reasonable to not take signals that are obviously bogus, will give more thinking |
@@ -115,13 +115,13 @@ def check_estimates(node, fees_seen): | |||
check_smart_estimates(node, fees_seen) | |||
|
|||
|
|||
def send_tx(wallet, node, utxo, feerate): | |||
def send_tx(wallet, node, feerate, utxo=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer just passing None
as utxo
explicitly, as appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. What do you think of this PR conceptually?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feels like a hack that would be better off identifying CPFP from the block contents itself. Possibly even using it for the estimates, if practical. But at least this should be better than nothing.
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsNo conflicts as of last run. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree it's better not to feed inaccurate data to our fee estimator. Do we have a preference for underestimation vs overestimation of feerates? It's probably bad to include both sponsors and sponsees in CPFP situations?
Ideally, one day we come up with a way to accurately record CPFP feerates in the fee estimator. Not to scope creep - I don't know how we could do it properly right now and I think this PR is better than having inaccurate estimates.
I think it's always better to be slightly overestimate than to slightly under-estimate, as the latter can result in frustration for the regular onchain user or degraded service quality for professionals (for instance stuck transactions can create a "shortage" of confirmed UTxOs or too long mempool chains). As far as i can tell from discussions and the algorithm it's always been the opinion to err on the conservative side.
Sponsees aren't included since transactions with mempool dependencies don't get considered for fee estimation. And i don't think we can do that until we find a proper way of considering packages in the fee estimator. |
e928f4c
to
200a49f
Compare
Ah, my bad! Seems like we're definitely underestimating in CPFP situations right now, then, since we include the parents but not the child.
I agree, it seems safer to sometimes pay a little extra than to miss your confirmation target. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree it's better to overestimate rather than underestimate, especially if you've have time-sensitive safety confirmations or offering block inclusion SLA to your withdrawals. I still wonder if we can't come up with a better heuristic.
Concept ACK. The inaccuracy seems real. Since we're targeting a simple heuristic, there will be downsides. The risk is overestimating and paying extra I guess. Otherwise, it would help if @darosior summarized the alternatives suggested in couple sentences (or helping in navigating the design scope e.g. "what could be possibly done"). As for the current code, I think it's an improvement, but I think we indeed need more eyes/opinions. |
Concept ACK Did some light review, looks reasonable, would review again when rebased |
200a49f
to
5ae45ad
Compare
Thanks everyone for the interest, finally reviving this. Rebased and modernized the code (use static casts and named parameters, make use of batched RPC calls after #26656).
We'd also need historical snapshots of a mempool to do that. If someone has got some and is willing to test, i'd be very interested in seeing the results.
The design space is actually quite narrow. Short of tracking packages, we can only get rid of invalid data points. I aim for the detection logic to be straightforward and conservative: better let a few invalid data points still go through rather than drop some valid ones.
It was suggested by @mzumsande in #25380 (comment) we may be able to get rid of the first heuristic, effectively only checking if a transaction we are tracking was mined along with a child transaction that has a higher feerate. I don't think it is correct, as the child transaction's effective feerate may be lower than the parent's. For instance:
In this diagram, Another thing that could be done here is to take into account "second-order" CPFP. That is, if the child of the child of a transaction we are tracking for fee estimation is bumping. For instance (same as above +
In this diagram we would not be detecting neither Finally, i'm considering being more strict on the first heuristic. We could for instance require that the child is at least a 10% increase in feerate to be considered a CPFP. Because currently a parent would be considered to have been CPFP'd if it has a child with an individual feerate |
Concept ACK
I have historical snapshots and would be happy to help run the test. Although, since running the test is not trivial, I wouldn't make a blocker for the PR. |
Please correct me if I am wrong on this. The
That means we are not just disregarding a some amount of fee rate statistics (all transactions with in mempool parent), We are tracking the stat with inaccurate feerate. This PR is trying to ensure that the ancestors of a transaction that is assumed to be CPFP should not be recorded in the fee estimator statistics as confirmed. If CPFP adoption continues to increase, the data that On Master e25af11 If we want to further improve fee estimation e.g. opting into the approach in #23074 comment or add heavy-calculation steps to With this #28368 we can now conveniently do that and try to add transactions to the fee estimator by ancestor feerate using #23074 comment approach or similar. Idk why @DrahdBot did not update but this PR conflicts with #28368 as updating the fee estimator from the new interface does not have details to detect the heuristic in this PR. |
The intent of this PR is to prevent a transaction tracked in the fee estimator to be assumed as being mined thanks to solely its own feerate when it's not. Technically it's equivalent to what you describe.
The fee estimator would fail if it doesn't have enough data points to give a reasonable estimation. Unfortunately right now in the situation you describe it would return an incorrectly low estimate instead for the reason mentioned above.
I'm skeptical we could improve the fee estimator to consider packages without a significant rewrite. The current algorithm tracks the time between arrival in our mempool and inclusion in a block. With a package you don't have a single arrival time anymore. And it's unclear how a replacement of part of the package would affect our estimates. Here is an example. A 4-txs package with a 10sats/vb feerate arrives at block height There is certainly an argument for restricting ourselves to certain kinds of packages, or to packages that are relayed once (assuming package relay otherwise it's still more complex) and never change. This way we could adapt our fee estimation framework to what's being used on the network and maybe adapt it as usage patterns change. To be fair this seems to me closer to a research project than something we could implement anytime soon, but maybe i'm overly pessimistic. |
DrahtBot doesn't detect silent merge conflicts afaik. But yeah, this PR uses the vector of entries to detect cpfp-ness in
I agree I don't see any simple solution for package-aware fee estimation... fwiw I was just brainstorming in that comment and it has many of the same inaccuracy problems. Perhaps with cluster mempool, we can notify the fee estimator each time a transaction's miner score changes (which is a lot of signals but should encompass package-aware scores + timing)? I think this PR is the right thing to do for now. |
…onnectedBlock` parameter Add `parents`, `nSizeWithAncestors`, and `nModFeesWithAncestors` entries to `NewMempoolTransactionInfo`. This commit then changes `MempoolTransactionsRemovedForConnectedBlock` parameter `txs_removed_for_block` type to `NewMempoolTransactionInfo`. This added info will enable bitcoin#25380 even after this, vice versa. assumed as being mined thanks to solely its own feerate when it's not". see bitcoin#25380 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK 12e3641
send_tx(self.wallet, node, u, high_feerate) | ||
self.sync_mempools(wait=0.1, nodes=[node, miner]) | ||
self.generate(miner, 1) | ||
assert node.estimaterawfee(1)["short"]["fail"]["totalconfirmed"] == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit
assert node.estimaterawfee(1)["short"]["fail"]["totalconfirmed"] == 0 | |
assert_equal(node.estimaterawfee(1)["short"]["fail"]["totalconfirmed"], 0) |
send_tx(self.wallet, node, u, low_feerate) | ||
self.sync_mempools(wait=0.1, nodes=[node, miner]) | ||
self.generate(miner, 1) | ||
assert node.estimaterawfee(1)["short"]["fail"]["totalconfirmed"] == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit
assert node.estimaterawfee(1)["short"]["fail"]["totalconfirmed"] == 1 | |
assert_equal(node.estimaterawfee(1)["short"]["fail"]["totalconfirmed"], 1) |
self.sync_mempools(wait=0.1, nodes=[node, miner]) | ||
self.generate(miner, 1) | ||
# Decay of 0.962, truncated to 2 decimals in the RPC result | ||
assert node.estimaterawfee(1)["short"]["fail"]["totalconfirmed"] == Decimal("1.96") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit
assert node.estimaterawfee(1)["short"]["fail"]["totalconfirmed"] == Decimal("1.96") | |
assert_equal(node.estimaterawfee(1)["short"]["fail"]["totalconfirmed"], Decimal("1.96")) |
node.batch(batch_send_tx) | ||
self.sync_mempools(wait=0.1, nodes=[node, miner]) | ||
self.generate(miner, 1) | ||
assert node.estimatesmartfee(2)["feerate"] == med_feerate * 1000 / COIN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit
assert node.estimatesmartfee(2)["feerate"] == med_feerate * 1000 / COIN | |
assert_equal(node.estimatesmartfee(2)["feerate"], (med_feerate * 1000 / COIN)) |
…onnectedBlock` parameter Add `parents`, `nSizeWithAncestors`, and `nModFeesWithAncestors` entries to `NewMempoolTransactionInfo`. This commit then changes `MempoolTransactionsRemovedForConnectedBlock` parameter `txs_removed_for_block` type to `NewMempoolTransactionInfo`. This added info will enable bitcoin#25380 even after this, vice versa. assumed as being mined thanks to solely its own feerate when it's not". see bitcoin#25380 (comment)
…onnectedBlock` parameter Add `parents`, `nSizeWithAncestors`, and `nModFeesWithAncestors` entries to `NewMempoolTransactionInfo`. This commit then changes `MempoolTransactionsRemovedForConnectedBlock` parameter `txs_removed_for_block` type to `NewMempoolTransactionInfo`. This added info will enable 25380 even after this, vice versa. assumed as being mined thanks to solely its own feerate when it's not". see bitcoin#25380 (comment)
…onnectedBlock` parameter Add `parents`, `nSizeWithAncestors`, and `nModFeesWithAncestors` entries to `NewMempoolTransactionInfo`. This commit then changes `MempoolTransactionsRemovedForConnectedBlock` parameter `txs_removed_for_block` type to `NewMempoolTransactionInfo`. This added info will enable 25380 even after this, vice versa. assumed as being mined thanks to solely its own feerate when it's not". see bitcoin#25380 (comment)
I will review |
if (individual_feerate > ancestor_score) { | ||
for (const CTxMemPoolEntry& parent : entry->GetMemPoolParents()) { | ||
CFeeRate parent_feerate{parent.GetFee(), static_cast<uint32_t>(parent.GetTxSize())}; | ||
if (parent_feerate < individual_feerate) _removeTx(parent.GetTx().GetHash(), /* inBlock = */true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we had a chain:
A->B-C
2 1 100 sat/vbyte
Would this mean that A
would be considered as not a CPFP by this logic?
if there's a limit to this logic it should be made clear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. This code would only catch the most simple one-parent one-child CPFPs. Do you think we should do better? I'm a bit wary of starting to consider the child as then we'd have to consider its own ancestor feerate to avoid false positives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with the limitation, but the commit message/docs should be very explicit what's been ignored by this commit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option is to consider 3+ chains as usual, without dropping the child. (CPFP implementations use RBF, right?)
Although I guess it won't be worse than today anyway, assuming someone really wants to trigger poor estimation on purpose?
I agree this should be better documented.
# score, it does not get taken into account by the fee estimator. | ||
tx = send_tx(self.wallet, node, None, low_feerate) | ||
u = {"txid": tx["txid"], "vout": 0, "value": Decimal(tx["tx"].vout[0].nValue) / COIN} | ||
send_tx(self.wallet, node, u, high_feerate) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use named args
# If it has descendants which have a lower ancestor score, it does though. | ||
tx = send_tx(self.wallet, node, None, high_feerate) | ||
u = {"txid": tx["txid"], "vout": 0, "value": Decimal(tx["tx"].vout[0].nValue) / COIN} | ||
send_tx(self.wallet, node, u, low_feerate) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use named args
self.sync_mempools(wait=0.1, nodes=[node, miner]) | ||
self.generate(miner, 1) | ||
# Decay of 0.962, truncated to 2 decimals in the RPC result | ||
assert node.estimaterawfee(1)["short"]["fail"]["totalconfirmed"] == Decimal("1.96") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems very brittle, can you add a formula to recalculate this if something changes, or use the formula directly here?
# Decay of 0.962, truncated to 2 decimals in the RPC result | ||
assert node.estimaterawfee(1)["short"]["fail"]["totalconfirmed"] == Decimal("1.96") | ||
|
||
# Generate and mine packages of transactions, 80% of them are a [low fee, high fee] package |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make some of these comments a log where appropriate
# Assert that we don't give the low feerate as estimate, assuming the low fee transactions | ||
# got mined on their own. | ||
for _ in range(4): | ||
txs = [] # Batch the RPCs calls. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is batching special for some reason?
if (individual_feerate > ancestor_score) { | ||
for (const CTxMemPoolEntry& parent : entry->GetMemPoolParents()) { | ||
CFeeRate parent_feerate{parent.GetFee(), static_cast<uint32_t>(parent.GetTxSize())}; | ||
if (parent_feerate < individual_feerate) _removeTx(parent.GetTx().GetHash(), /* inBlock = */true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I stared at the result of _removeTx
for some time. false
would mean it's already been removed by another child, which is fine in the current implementation but might be confusing later (if not documented).
if (individual_feerate > ancestor_score) { | ||
for (const CTxMemPoolEntry& parent : entry->GetMemPoolParents()) { | ||
CFeeRate parent_feerate{parent.GetFee(), static_cast<uint32_t>(parent.GetTxSize())}; | ||
if (parent_feerate < individual_feerate) _removeTx(parent.GetTx().GetHash(), /* inBlock = */true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option is to consider 3+ chains as usual, without dropping the child. (CPFP implementations use RBF, right?)
Although I guess it won't be worse than today anyway, assuming someone really wants to trigger poor estimation on purpose?
I agree this should be better documented.
…onnectedBlock` parameter Add `parents`, `nSizeWithAncestors`, and `nModFeesWithAncestors` entries to `NewMempoolTransactionInfo`. This commit then changes `MempoolTransactionsRemovedForConnectedBlock` parameter `txs_removed_for_block` type to `NewMempoolTransactionInfo`. This added info will enable 25380 even after this, vice versa. assumed as being mined thanks to solely its own feerate when it's not". see bitcoin#25380 (comment)
…onnectedBlock` parameter Add `parents`, `nSizeWithAncestors`, and `nModFeesWithAncestors` entries to `NewMempoolTransactionInfo`. This commit then changes `MempoolTransactionsRemovedForConnectedBlock` parameter `txs_removed_for_block` type to `NewMempoolTransactionInfo`. This added info will enable 25380 even after this, vice versa. assumed as being mined thanks to solely its own feerate when it's not". see bitcoin#25380 (comment)
Needs rebase, if still relevant |
🤔 There hasn't been much activity lately and the CI seems to be failing. If no one reviewed the current pull request by commit hash, a rebase can be considered. While the CI failure may be a false positive, the CI hasn't been running for some time, so there may be a real issue hiding as well. A rebase triggers the latest CI and makes sure that no silent merge conflicts have snuck in. |
Let's face it: looks like i'm not getting back to this. I think it's still a bit concerning, but hey i'm focused on other things right now and other people are looking at fee estimation. Let's mark this as up for brags.
Inspired by @willcl-ark |
@ismaelsadeeq you may be interested in picking this up. |
PIcked up in #30079. |
The fee estimator currently considers the fee of a transaction in isolation (and doesn't consider transactions with unconfirmed parents). This could hold well when it was just a few outliers. But as CPFP is becoming common practice whether it is among wallets (#23074 has a few instances) or protocols (for instance the LN with anchor outputs), i'm afraid that if fees rise this can lead to significantly underestimated results.
See the first commit's message for details and #23074 for history.
This PR is a much simpler alternative to #23074. The point is to fix the issue before we come up with a reasonable way to track package feerates in the fee estimator.
The functional test is in a separate commit so reviewers can more easily cherry-pick it on master and see the fee estimator would return the wrong estimate.