-
Notifications
You must be signed in to change notification settings - Fork 37.7k
bug fix: valid but different LockPoints after a reorg #23683
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cc @MarcoFalke, thanks for catching my bug 🐛 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK. Left some questions on the test.
48d1b46
to
bd716bf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too bad this can't properly be tested on regtest. I guess the only way is via a pre-mined main-chain?
Concept ACK bd716bf 🙆
Show signature
Signature:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
Concept ACK bd716bf774d21956ec7ae616d06a2a896568ed18 🙆
-----BEGIN PGP SIGNATURE-----
iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
pUjr2gwAqMWws2MpZHhtGe0UDOPIk0PKWrc/Xbg991hVwlMwmHv2ZdgTm0aeYa15
zI0/9PVfKF7P1mTn5iWjJGaoAzsWPyRQAO7imfKwiYOUQnqBBP8s4qpjZxAyE+y4
n9CtBkoDxJd6FBY3/PaC+A/TQCXqrfoefLXMqkCjkFl5aF85YRyAvB69xpKDwMp0
KJLqSga6ThHVkHJOXPoD09yjCEOkT/2K7z2XMTyzGMkzmTBJnhBaLd50Lfy31Ioj
EQz4G7wEskfJkQBATIDzvy0nbwbH38hL30GEVolasAYLUdlxnRQr1QnZEnEUEoV1
MrdzGtQYuBIUyEoaqJyvL/WVb+WEPy1XL/R3VmHf8vWbhAJK2W8RkGW1auE7WZwP
W8HubXwO/dqmXqGd2A9QUaRN1RtYRKL7QwA5ffym++tJQRtpqT4/htuFH4BaIfv/
+Z4d4NPkP8ySa1fhwWvMvLf2JM+5V9572LPInHka1B7t9z9DRjVCN3fb/nMOBzUp
A2NkaSsm
=9LGj
-----END PGP SIGNATURE-----
During a reorg, we re-check timelocks on all mempool entries using CheckSequenceLocks(useExistingLockPoints=false) and remove any now-invalid entries. CheckSequenceLocks() also mutates the LockPoints passed in, and we update valid entries' LockPoints using update_lock_points. Thus, update_lock_points(lp) needs to be called right after CheckSequenceLocks(lp), otherwise we lose the data in lp. commit bedf246 introduced a bug by separating those two loops.
bd716bf
to
28b60ce
Compare
utACK 28b60ce The bug is pretty scary, good thing we found it this soon, and the fix is straightforward. |
Ah no. I don't think this is possible to test at all. The lockpoints are just a cache and if it is outdated, evaluating it will always return false, in which case the lockpoints are re-evaluated. So this is not fixing a behaviour bug, but potentially improving performance a bit? |
The "natural reorg to a new chain that is shorter (but has more work)" is probably not possible without significant changes to regtest chainparams, I think. Since this PR adds assertions for the lockpoints, making sure they're updated is testable. And true, I suppose outdated lockpoints are not that bad, though now I'm wondering if it's worth adding the assertions if it's relatively harmless. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 28b60ce
The updates look good to me, and they seem to fix the bug introduced in #22677.
I was able to run the test successfully on the master branch. Thanks for fixing this quickly, @glozow!
p.s. While I ran all the tests on the PR branch using test/functional/test_runner.py
the following tests failed:
feature_coinstatsindex.py | ✖ Failed | 2 s
feature_rbf.py --legacy-wallet | ✖ Failed | 2 s
interface_bitcoin_cli.py | ✖ Failed | 3 s
wallet_transactiontime_rescan.py | ✖ Failed | 6 s
Though it is not probably related to this PR. I still wanted to bring it to other contributors' attention. I shall also try running these tests on the master branch to check if the error persists.
Update: The same tests fails with the master too (with top commit d20d6ac). I am not sure what the reason is, though.
I was hoping there was a way to test this on current master without modifying the code or looking at the process internal memory, but yeah... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK a64078e
Yes, I don't think this is a bug introducing invalid transactions in the mempool. Just a perf regression. If you have a reorg from chain H to chain H'' and the lockpoints are not valid anymore on H'', the cache is not going be updated accordingly. Let's say you have one more reorg from chain H'' to chain H'''. If H''' includes the max So if this reasoning is correct, I also agree that caching lockpoints is a perf improvement, in the sense they save few coins fetches and block index walkes ? |
How did you catch this? It looks pretty obscure to me. |
Through code review, basically. Though, it was a bit tricky because there was never an externally observable logic bug, at most the performance difference. For a few hours I even thought the code could be removed (#23660) 😅 |
Change looks good to me. I agree it fixes the lockpoint invalidity bug introduced in #22677. A few suggestions for the
I've included a suggested patch below. Feel free to take whatever you think is an improvement. Suggested patchdiff --git a/src/txmempool.cpp b/src/txmempool.cpp
index 26cb2ff4b4..c742548875 100644
--- a/src/txmempool.cpp
+++ b/src/txmempool.cpp
@@ -631,7 +631,7 @@ void CTxMemPool::removeForReorg(CChain& chain, std::function<bool(txiter)> check
setEntries txToRemove;
for (indexed_transaction_set::const_iterator it = mapTx.begin(); it != mapTx.end(); it++) {
- if (check_final_and_mature(it)) txToRemove.insert(it);
+ if (!check_final_and_mature(it)) txToRemove.insert(it);
}
setEntries setAllRemoves;
for (txiter it : txToRemove) {
diff --git a/src/validation.cpp b/src/validation.cpp
index f53641e786..8da2e80510 100644
--- a/src/validation.cpp
+++ b/src/validation.cpp
@@ -350,21 +350,38 @@ void CChainState::MaybeUpdateMempoolForReorg(
// the disconnectpool that were added back and cleans up the mempool state.
m_mempool->UpdateTransactionsFromBlock(vHashUpdate);
- const auto check_final_and_mature = [this, flags=STANDARD_LOCKTIME_VERIFY_FLAGS](CTxMemPool::txiter it)
+ // Check whether the transaction is still final/mature, and update
+ // the transaction's cached lockpoints if necessary.
+ // If this returns false, the mempool *must* remove the transaction, since
+ // it is no longer valid for inclusion in the next block (and its cached
+ // lockpoints may be invalid).
+ const auto check_final_and_mature = [this, flags=STANDARD_LOCKTIME_VERIFY_FLAGS](CTxMemPool::txiter it) -> bool
EXCLUSIVE_LOCKS_REQUIRED(m_mempool->cs, ::cs_main) {
- bool should_remove = false;
AssertLockHeld(m_mempool->cs);
AssertLockHeld(::cs_main);
const CTransaction& tx = it->GetTx();
+
+ // Check transaction finality
+ if (!CheckFinalTx(m_chain.Tip(), tx, flags)) {
+ return false;
+ }
+
+ // Check sequence locks
LockPoints lp = it->GetLockPoints();
const bool validLP{TestLockPointValidity(m_chain, lp)};
CCoinsViewMemPool view_mempool(&CoinsTip(), *m_mempool);
- if (!CheckFinalTx(m_chain.Tip(), tx, flags)
- || !CheckSequenceLocks(m_chain.Tip(), view_mempool, tx, flags, &lp, validLP)) {
- // Note if CheckSequenceLocks fails the LockPoints may still be invalid
- // So it's critical that we remove the tx and not depend on the LockPoints.
- should_remove = true;
- } else if (it->GetSpendsCoinbase()) {
+
+ if (!CheckSequenceLocks(m_chain.Tip(), view_mempool, tx, flags, &lp, validLP)) {
+ // Sequence locks fail. The mempool *must* remove this transaction since
+ // we're not going to update its cached lockpoints.
+ return false;
+ } else if (!validLP) {
+ // CheckSequenceLocks has modified lp. Update the cached lockpoints.
+ m_mempool->mapTx.modify(it, update_lock_points(lp));
+ }
+
+ // Check coinbase spend maturity
+ if (it->GetSpendsCoinbase()) {
for (const CTxIn& txin : tx.vin) {
auto it2 = m_mempool->mapTx.find(txin.prevout.hash);
if (it2 != m_mempool->mapTx.end())
@@ -373,18 +390,19 @@ void CChainState::MaybeUpdateMempoolForReorg(
assert(!coin.IsSpent());
const auto mempool_spend_height{m_chain.Tip()->nHeight + 1};
if (coin.IsSpent() || (coin.IsCoinBase() && mempool_spend_height - coin.nHeight < COINBASE_MATURITY)) {
- should_remove = true;
- break;
+ // Coinbase spend is now immature. Remove tx from mempool.
+ return false;
}
}
}
- // CheckSequenceLocks updates lp. Update the mempool entry LockPoints.
- if (!validLP) m_mempool->mapTx.modify(it, update_lock_points(lp));
- return should_remove;
+
+ // Transaction is still final and mature. Cached lockpoints have been updated.
+ return true;
};
- // We also need to remove any now-immature transactions
+ // Remove any transactions that are no longer final/mature, and update cached lockpoints.
m_mempool->removeForReorg(m_chain, check_final_and_mature);
+
// Re-limit mempool size, in case we added any transactions
LimitMempoolSize(
*m_mempool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK b4adc5a
Notice - the ACKed commit is the HEAD~
of this PR, without the topmost commit which adds a test which I do not think is harmful and I do not object. But I don't see what value it adds because:
- It passes with and without the fix
- If an assert is added to the code to make the test fail, then
mempool_reorg.py
fails even without the extension added in this PR - The extension from this PR does not increase the code coverage
This is not an objection to the added test snippet, rather ~0 on it. I am happy if it gathers enough ACKs from other devs and gets merged.
28b60ce
to
a69c75d
Compare
I've removed the test. Perhaps a followup dedicated to test improvements can convert the
Thanks, I've applied (most of) your diff. I see what you mean by the return results being unexpected for something called "check." My wish is to follow the convention of a unary predicate that returns |
a69c75d
to
2586226
Compare
2586226
to
b4adc5a
Compare
Sorry for the churn. I had wanted to incorporate the suggestions in #23683 (comment) for the lambda, but I now think it's best saved for a different PR. |
I think it's very important that the comments are updated to clarify what this function is doing. A bug was introduced in this code because the code was unclear, so we should take the opportunity to clarify it to prevent a similar regression being added again. This branch does fix the recent regression, but I think it's significantly worse than the previous version since good tests and clarifying code comments have been removed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review ACK b4adc5a
I agree that this fixes the "bug", and verified that mempool_reorg.py
fails if only the assert from b4adc5a is added but not the fix.
Can't comment on any of the older versions/tests - I think that the refactoring/doc proposed by @jnewbery looks sensible, but could also be a follow-up PR.
Commit message nit:
During a reorg, we re-check timelocks on all mempool entries using
CheckSequenceLocks(useExistingLockPoints=false) and remove any
now-invalid entries.
That sounds as if we call CheckSequenceLocks(useExistingLockPoints=false)
for all entries, but we call TestLockPointValidity()
for all mempool entries, and call
CheckSequenceLocks(useExistingLockPoints=false)
only for those with invalid lockpoints.
Concept ACK. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK b4adc5a
ACK b4adc5a But please do update the comments and add a test in a follow-up. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-ACK b4adc5a 🏁
Only changes:
- Remove commit that adds test (bd716bf)
- Remove useless doxygen comment on struct update_lock_points
- Expand the commit message of commit b4adc5a "[bugfix] update lockpoints correctly during reorg"
Show signature
Signature:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
re-ACK b4adc5ad6769e4a5a6179dfff271cd4c9dc47a5b 🏁
Only changes:
* Remove commit that adds test (bd716bf774)
* Remove useless doxygen comment on struct update_lock_points
* Expand the commit message of commit b4adc5ad67 "[bugfix] update lockpoints correctly during reorg"
-----BEGIN PGP SIGNATURE-----
iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
pUjyJgwAiIcxIULHGXbeJdpJrr/yTOIL9za7KmSpiwuFZgB5alGaxZHVwNXz5tMs
JgREgzSCQyGFl6otzezJKQ4ZwZ3+m6Xkjlop9DGQNgVXlRPs0hAUwePLj0c1a0Y7
XpJaCS8U3Ovi/av/eTXBaAAVZ/tH/ngqiSo1Y7E41e4c12IGtE4JlgXaaI8+vaC8
4oF29XugN/FWuoY8WXqXGHwnRyxyhid9QR7uLfmQC+0tcEUW2bu9yDH+hWuFeNs1
8gu6mb76OuMqRQR/PRA7Jx+vN2WlN9eCua+TFVWvxS9vl06jOOrz2Yw4qC8gisgz
J5rzAH8y5aMnztO9I2//smKojsxtierk2dIe2BXBGsgoo8GqxsbLocCHuTrvlt56
eBJuUZLFYc3G5WtxD55cc6WpPGMVJJ/5sfptnYYNiaaca6bIYk1dmBklAQDdeyGg
GtwDfvZIbYFMU2Aq+cAta59oo7Xtw9tSWS6Kp1sKJOQxQfhB39XBxjgc5qwZ7L1z
hSX5amzs
=W5Xo
-----END PGP SIGNATURE-----
…reorg b4adc5a [bugfix] update lockpoints correctly during reorg (glozow) b6002b0 MOVEONLY: update_lock_points to txmempool.h (glozow) Pull request description: I introduced a bug in bitcoin#22677 (sorry! 😅) Mempool entries cache `LockPoints`, containing the first height/blockhash/`CBlockIndex*` at which the transaction becomes valid. During a reorg, we re-check timelocks on all mempool entries using `CheckSequenceLocks(useExistingLockPoints=false)` and remove any now-invalid entries. `CheckSequenceLocks()` also mutates the `LockPoints` passed in, and we update valid entries' `LockPoints` using `update_lock_points`. Thus, `update_lock_points(lp)` needs to be called right after `CheckSequenceLocks(lp)`, otherwise we lose the data in `lp`. I incorrectly assumed they could be called in separate loops. The incorrect behavior introduced is: if we have a reorg in which a timelocked mempool transaction is still valid but becomes valid at a different block, the cached `LockPoints` will be incorrect. This PR fixes the bug, adds a test, and adds an assertion at the end of `removeForReorg()` to check that all mempool entries' lockpoints are valid. You can reproduce the bug by running the test added in the [test] commit on the code before the [bugfix] commit. ACKs for top commit: jnewbery: ACK b4adc5a vasild: ACK b4adc5a mzumsande: Code Review ACK b4adc5a hebasto: ACK b4adc5a MarcoFalke: re-ACK b4adc5a 🏁 Tree-SHA512: 16b59f6ff8140d0229079ca1c6b04f2f4a00a2e49931275150e4f3fe5ac4ec109698b083fa6b223ba9511f328271cc1ab081263669d5da020af7fee83c13e401
…ForReorg e177fca Replace `struct update_lock_points` with lambda (glozow) c7cd98c document and clean up MaybeUpdateMempoolForReorg (glozow) Pull request description: followup to #23683, addressing bitcoin/bitcoin#23683 (comment) ACKs for top commit: hebasto: ACK e177fca, I have reviewed the code and it looks OK, I agree it can be merged. instagibbs: ACK e177fca MarcoFalke: Approach ACK e177fca 😶 Tree-SHA512: 8c2709dd5cab73cde41f3e5655d5f237bacfb341f78eac026169be579528695ca628c8777b7d89760d8677a4e6786913293681cfe16ab702b30c909703e1824c
Summary: This is a partial backport of [[bitcoin/bitcoin#22677 | core#22677]], [[bitcoin/bitcoin#23649 | core#23649]] and [[bitcoin/bitcoin#23683 | core#23683]] bitcoin/bitcoin@bedf246 bitcoin/bitcoin@b01784f bitcoin/bitcoin@b4adc5a Note that [[bitcoin/bitcoin#23683 | core#23683]] mostly reverts the commit from [[bitcoin/bitcoin#22677 | core#22677]]. The remaining diff after applying both commits consists of just an additional assertion and comment. Also note that core#23683 was applied out of sequence to avoid introducing a bug. In the source material, the assertion is added to the code after a piece of it was moved to validation.cpp. Depends on D12440 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Differential Revision: https://reviews.bitcoinabc.org/D12441
Summary: The lockpoints are not changed in this function. There is no reason to pass a pointer. Note for review: this is backported out of order after [[bitcoin/bitcoin#23683 | core#23683]] This concludes backport of [[bitcoin/bitcoin#23649 | core#23649]] bitcoin/bitcoin@c4efc4d Depends on D12448 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Differential Revision: https://reviews.bitcoinabc.org/D12453
Summary: This concludes backport of [[bitcoin/bitcoin#23683 | core#23683]] bitcoin/bitcoin@b6002b0 Depends on D12453 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Differential Revision: https://reviews.bitcoinabc.org/D12454
…m `CheckSequenceLocksAtTip()` 75db62b refactor: Move calculation logic out from `CheckSequenceLocksAtTip()` (Hennadii Stepanov) 3bc434f refactor: Add `CalculateLockPointsAtTip()` function (Hennadii Stepanov) Pull request description: This PR is follow up for bitcoin/bitcoin#22677 and bitcoin/bitcoin#23683. On master (013daed) it is not obvious that `CheckSequenceLocksAtTip()` function can modify its `LockPoints* lp` parameter which leads to bitcoin/bitcoin#22677 (comment). This PR: - separates the lockpoint calculate logic from `CheckSequenceLocksAtTip()` function into a new `CalculateLockPointsAtTip()` one - cleans up the `CheckSequenceLocksAtTip()` function interface - makes code easier to reason about (hopefully) ACKs for top commit: achow101: ACK 75db62b stickies-v: re-ACK 75db62b Tree-SHA512: 072c3fd9cd1e1b0e0bfc8960a67b01c80a9f16d6778f374b6944ade03a020415ce8b8ab2593b0f5e787059c8cf90af798290b4c826785d41955092f6e12e7486
I introduced a bug in #22677 (sorry! 😅)
Mempool entries cache
LockPoints
, containing the first height/blockhash/CBlockIndex*
at which the transaction becomes valid. During a reorg, we re-check timelocks on all mempool entries usingCheckSequenceLocks(useExistingLockPoints=false)
and remove any now-invalid entries.CheckSequenceLocks()
also mutates theLockPoints
passed in, and we update valid entries'LockPoints
usingupdate_lock_points
. Thus,update_lock_points(lp)
needs to be called right afterCheckSequenceLocks(lp)
, otherwise we lose the data inlp
. I incorrectly assumed they could be called in separate loops.The incorrect behavior introduced is: if we have a reorg in which a timelocked mempool transaction is still valid but becomes valid at a different block, the cached
LockPoints
will be incorrect.This PR fixes the bug, adds a test, and adds an assertion at the end of
removeForReorg()
to check that all mempool entries' lockpoints are valid. You can reproduce the bug by running the test added in the [test] commit on the code before the [bugfix] commit.