-
Notifications
You must be signed in to change notification settings - Fork 37.8k
net processing: Move orphan reprocessing to a global #19364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
70501cf
to
b957b0d
Compare
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
@@ -73,6 +73,14 @@ class PeerLogicValidation final : public CValidationInterface, public NetEventsI | |||
*/ | |||
bool SendMessages(CNode* pto) override EXCLUSIVE_LOCKS_REQUIRED(pto->cs_sendProcessing); | |||
|
|||
/** | |||
* Process global tasks that aren't attached to a specific peer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if we should clarify that these tasks are expected to be limited to 1 second or something (perhaps "lightweight tasks")?
Just so that future devs don't tempt to put any heavy logic in here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Such documentation, if it's added, should be at the level of the PeerLogicValidation
class. We don't want anything in the message handler thread to block for a long time (or to be blocked by slow tasks in other threads). That includes anything in ProcessMessages()
, SendMessages()
and (now) ProcessGlobalTasks()
.
Concept ACK. |
Does this new global data structure need to be limited in size in some way? At least when it was per peer, it would go away when the peer was disconnected. I'm vaguely worried that this might open up a new DoS vector. |
That's the right question to be asking, but I think we're safe:
These entries can't persist for long in the set. I could refactor Longer term, it might make sense to consolidate all orphan logic into a module, similar to how #19184 overhauls request logic. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK b957b0d modulo below
Looks to me that this PR is a very natural extension of #15644. (It's maybe in the same line of thinking as this comment there) Naively this global approach feels better.
In regards to the addition of ProcessGlobalTasks, it seems reasonable to me but I don't think I have the background to honestly Concept ACK it. (i.e. I can't fully grasp the impact it will have on future development). Looks like others have, so no issue here.
If I can give it a shot, the suspected DoS vector is something along the lines of a peer feeding us fake orphan transactions, therefore inflating the size of g_orphan_work_set
and preventing us from processing any other orphaned transactions? Seems like once that peer is dealt with, g_orphan_work_set
will be cleared relatively quickly.
src/net_processing.h
Outdated
* @param[in] interrupt Interrupt condition for processing threads | ||
* @returns bool true if there's more work to do | ||
*/ | ||
bool ProcessGlobalTasks(std::atomic<bool>& interrupt); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Compiler warning -Winconsistent-missing-override
bool ProcessGlobalTasks(std::atomic<bool>& interrupt); | |
bool ProcessGlobalTasks(std::atomic<bool>& interrupt) override; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Fixed!
ProcessGlobalTasks does general work that isn't attached to a specific peer.
Orphan reprocessing should not be tied to the peer that provided the parent. Move orphan_work_set to be a global and process it in ProcessGlobalTasks()
We reprocess orphan transactions on every MessageHandler thread, so there's no need to call it in the context of an individual peer.
I think @laanwj's concern may have been in general about us storing unvalidated data from a peer, which is always something that we need to be careful about. It's not so much of a concern here because the size of I have a branch here: https://github.com/jnewbery/bitcoin/tree/2020-06-global-orphans2 that refactors |
b957b0d
to
9487f08
Compare
Concept ACK. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting idea! This LGTM.
I wonder if there may have been a reason why it was implemented this way in #15644 / 866c805, so it may be good to have review by the author of the code.
To keep the commits hygienic, consider removing the call to ProcessOrphanTx
from ProcessMessage
in the same commit that adds the new call to it from ProcessGlobalTasks
, e.g. squashing the last two commits.
If you retouch, might be good to fixup s/ProcessMessages/ProcessMessage/
in the commit message, as both function names exist in the file.
Consider adding here the Doxygen comments commit from your branch.
If this moves forward, big concept ACK on the changes in https://github.com/jnewbery/bitcoin/commits/2020-06-global-orphans2.
@@ -2832,7 +2833,7 @@ void ProcessMessage( | |||
auto it_by_prev = mapOrphanTransactionsByPrev.find(COutPoint(inv.hash, i)); | |||
if (it_by_prev != mapOrphanTransactionsByPrev.end()) { | |||
for (const auto& elem : it_by_prev->second) { | |||
pfrom.orphan_work_set.insert(elem->first); | |||
g_orphan_work_set.insert(elem->first); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
9200126 nit: would it be better to use emplace
here to construct rather than copy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
emplace wins when you have to construct a temporary in place which is then copied/moved to the container. When you already have the object in hand (as we do here), it makes no difference - the object is just copied into the container.
See #15644 (comment)
I think that'd be doing too much in the same commit, but I'm happy to squash if other reviewers agree.
I did intend to put ProcessMessages. ProcessMessage is called from ProcessMessages, and ProcessMessages is the entry point into net processing for the message handler thread, so from a high level, it makes sense to say that his change is moving orphan reconsideration from ProcessMessages into ProcessGlobalTasks.
I think that can stay in the follow-up PR, but again, I'm happy to add it here if other reviewers think it makes more sense to add it here. |
reACK 9487f08 Looking forward to the follow-up! |
It took me a while to remember the context here, but this PR causes a subtle but significant change in P2P behavior. The reason that there is a per-peer set of to-be-processed potential orphans is that this lets us pause processing for that peer until they're all processed - as long as the orphan work set for a peer is non-empty, all that happens for that peer is steps towards emptying that set. This makes the behavior observationally equivalent (from the perspective of any single peer) to recursively processing all orphans immediately as soon as the "missing link" transaction is received. With this change that is no longer the case - P2P messages from the peer that provides the missing link can and will be processed and responded to before processing the dependent orphans. I'm not sure that anything depends on that behavior, but I also can't guarantee that there isn't. I don't think code cleanup is a sufficient justification for such a fundamental P2P change. |
For context, this is part of a wider project to move all application-layer data and processing into net processing. The PR that introduced I can't think of any problems this might cause. If a peer announces a tx that conflicts with an orphan and also provides the parent, then in master, the orphan will be accepted into the mempool, but in this branch the conflicting tx may be accepted first. I don't think that's an issue since in the case of two conflicting txs we never guarantee which one will make it into the mempool. Are there other scenarios we should be considering? Generally, if we're worried about behaviour changes we try to reason about them and satisfy ourselves that they're safe, and include tests if we have any doubts. I don't see why this PR should be different. |
Lots of discussion about this on IRC today: http://www.erisian.com.au/bitcoin-core-dev/log-2020-07-02.html L325 onwards. |
FWIW here is a commit that moves It also makes |
This PR intended to do two things:
I think both changes are good. If everyone agreed, I don't think it'd be a problem to do both in this PR, but since this is controversial and @sipa has concerns that (1) is correct, it makes sense to split them out and consider them separately. The branch at https://github.com/sipa/bitcoin/commits/202007_orphan_work_set_to_np seems fine, but I hope we'll soon have a |
001343f ProcessOrphanTx: Move AddToCompactExtraTransactions call into ProcessOrphanTx (John Newbery) 4fce726 ProcessOrphanTx: Remove aliases (John Newbery) e07c5d9 ProcessOrphanTx: Remove outdated commented (John Newbery) 4763b51 ProcessOrphanTx: remove useless setMisbehaving set (John Newbery) 55c79a9 ProcessOrphanTx: remove useless done variable (John Newbery) 6e8dd99 [net processing] Add doxygen comments for orphan data and function (John Newbery) Pull request description: Originally a follow-up to #19364, this simplifies the logic in ProcessOrphanTx() and removes unused variables. ACKs for top commit: troygiorshev: ACK 001343f sipa: utACK 001343f MarcoFalke: ACK 001343f 🌮 Tree-SHA512: be558457f2e08ebb6bddcd49bdd75bd410c3650da44a76c688fc9f07822f94d5a1af93fa1342678052b2c8163cdb9745c352c7884325ab0a41fa593c3eb89116
001343f ProcessOrphanTx: Move AddToCompactExtraTransactions call into ProcessOrphanTx (John Newbery) 4fce726 ProcessOrphanTx: Remove aliases (John Newbery) e07c5d9 ProcessOrphanTx: Remove outdated commented (John Newbery) 4763b51 ProcessOrphanTx: remove useless setMisbehaving set (John Newbery) 55c79a9 ProcessOrphanTx: remove useless done variable (John Newbery) 6e8dd99 [net processing] Add doxygen comments for orphan data and function (John Newbery) Pull request description: Originally a follow-up to bitcoin#19364, this simplifies the logic in ProcessOrphanTx() and removes unused variables. ACKs for top commit: troygiorshev: ACK 001343f sipa: utACK 001343f MarcoFalke: ACK 001343f 🌮 Tree-SHA512: be558457f2e08ebb6bddcd49bdd75bd410c3650da44a76c688fc9f07822f94d5a1af93fa1342678052b2c8163cdb9745c352c7884325ab0a41fa593c3eb89116
Currently we keep a per-peer list of orphans to reconsider after one of their parents is accepted. There's no reason this list should be connected to the peer that provided the parent, so instead make it a global. Process that list in a separate call into net processing that does general work rather than in the context of a specific peer.