-
Notifications
You must be signed in to change notification settings - Fork 37.7k
net: fix remaining net assertions #9609
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
(fixes #9212)
…On January 21, 2017 1:54:26 AM EST, Cory Fields ***@***.***> wrote:
First change is from @TheBlueMatt.
4629656
added assertions to ensure that we never send out any messages before
version negotiation has completed, but a few issues remain.
In cases where the version message failed to deserialize fully, we can
end up with only some of the vars set, leading to sending messages that
shouldn't be sent.
Additionally, the ForEach* functions allow the caller to send messages
to all nodes, including those that aren't yet fully connected. To
prevent that, filter out nodes that definitely shouldn't be
sending/processing messages.
As a follow-up, I'd like to remove the assert, as I think it's done its
job, but I'll save that discussion for another PR because I think
@TheBlueMatt mentioned that he'd prefer to keep it.
You can view, comment on, or merge this pull request online at:
#9609
-- Commit Summary --
* Dont deserialize nVersion into CNode, should fix #9212
* net: deserialize the entire version message locally
* net: don't run callbacks on nodes that haven't completed the version
handshake
-- File Changes --
M src/net.cpp (7)
M src/net.h (38)
M src/net_processing.cpp (54)
-- Patch Links --
https://github.com/bitcoin/bitcoin/pull/9609.patch
https://github.com/bitcoin/bitcoin/pull/9609.diff
--
You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub:
#9609
|
src/net_processing.cpp
Outdated
} | ||
if (!vRecv.empty()) { | ||
vRecv >> pfrom->nStartingHeight; | ||
vRecv >> nStartingHeight; | ||
} | ||
{ | ||
LOCK(pfrom->cs_filter); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This block no longer accesses a cs_filter protected variable. The lock and braces should be removed.
src/net_processing.cpp
Outdated
pfrom->strSubVer = strSubVer; | ||
pfrom->cleanSubVer = SanitizeString(strSubVer); | ||
pfrom->nStartingHeight = nStartingHeight; | ||
pfrom->fRelayTxes = fRelay; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was protected by cs_filter above.
src/net_processing.cpp
Outdated
|
||
// Change version | ||
pfrom->SetSendVersion(nSendVersion); | ||
pfrom->nVersion = nVersion; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nVersion should be made atomic.
@gmaxwell ACK all of those, will fix. |
src/net_processing.cpp
Outdated
@@ -1278,11 +1291,7 @@ bool static ProcessMessage(CNode* pfrom, string strCommand, CDataStream& vRecv, | |||
UpdatePreferredDownload(pfrom, State(pfrom->GetId())); | |||
} | |||
|
|||
// Change version | |||
connman.PushMessage(pfrom, CNetMsgMaker(INIT_PROTO_VERSION).Make(NetMsgType::VERACK)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we set succesfully connected and nversion before sending VERACK is it possible that some other thread will send some other message to this peer before VERACK has been sent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and that's ok. It's seen on the current network.
What's not ok is that messages before VERACK could possibly end up being sent with the upgraded version. So i suppose setting the sendversion should stay as-is, after the VERACK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Er, that doesn't work, it just reintroduces the problem that we're attempting to solve.
A mutex for a group of these attributes may be unavoidable. I'll poke at it some more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the setsetversion can be run on based on nversion variable. So there is no reason it can't be done early with the nversion set later, as far as I can tell.
Ok, I think this is solid now. |
tenuous utACK I'm also not very familiar with feeler connections but would it make sense to move the feeler check before the fSuccessfullyConnected, so that we set fDisconnect first, and then won't send anything to the feeler peers? Are we meant to be sending addr and GetAddr messages to the feeler peers? |
@morcos yes we should send addr to feelers-- their purpose is to improve knowledge about the node addresses all around. I agree we should probably set Successfully after the disconnect flag is set for them. |
src/net.h
Outdated
@@ -166,7 +166,7 @@ class CConnman | |||
{ | |||
LOCK(cs_vNodes); | |||
for (auto&& node : vNodes) | |||
if(!func(node)) | |||
if (NodeFullyConnected(node) && !func(node)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just remove unused functions instead of fixing them?
src/net_processing.cpp
Outdated
// Change version | ||
pfrom->SetSendVersion(nSendVersion); | ||
pfrom->nVersion = nVersion; | ||
pfrom->fSuccessfullyConnected = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're gonna be using fSuccessfullyConnected (which I think we should), can we set it in VERACK instead and use it in SendMessages? I looked over where we check nVersion for "are we connected" and it all looks like that would leave it at just ProcessMessage, which is right.
This would mean we at least dont do anything but respond to messages if we haven't gotten the peer's verack yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if someone sends us a VERACK without anything else having happened? :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@TheBlueMatt That disagrees with my intended future usage of fSuccessfullyConnected (net layer only), but I agree that it makes more sense that way for now. I'll make that change, and we can discuss further post-0.14.
Then we can catch it with our nVersion != 0 check :)
…On January 24, 2017 6:47:38 PM EST, Gregory Maxwell ***@***.***> wrote:
gmaxwell commented on this pull request.
> - if((pfrom->nServices & NODE_WITNESS))
+ pfrom->nServices = nServices;
+ pfrom->addrLocal = addrMe;
+ pfrom->strSubVer = strSubVer;
+ pfrom->cleanSubVer = SanitizeString(strSubVer);
+ pfrom->nStartingHeight = nStartingHeight;
+ pfrom->fClient = !(nServices & NODE_NETWORK);
+ {
+ LOCK(pfrom->cs_filter);
+ pfrom->fRelayTxes = fRelay; // set to true after we get
the first filter* message
+ }
+
+ // Change version
+ pfrom->SetSendVersion(nSendVersion);
+ pfrom->nVersion = nVersion;
+ pfrom->fSuccessfullyConnected = true;
what if someone sends us a VERACK without anything else having
happened? :P
--
You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub:
#9609
|
ACK a79e00b8b251ab3744bd2f1bd48f265c1a7d7912 |
That commit is not part of the repository :) |
It's the head I reviewed and tested however. :P (I wondered if someone would notice.) (it sounds like cfields intends to change this more) |
Fixed up as requested. fSuccessfullyConnected is set last thing when processing VERACK so that we push out SENDCMPCT/SENDHEADERS first. We will now avoid doing anything in SendMessages() until the handshake is complete. @morcos That fixes the feeler issue as well, since after VERSION, no more messages will be processed, and nothing else will go out. @gmaxwell VERACK can't be seen before VERSION, as we refuse to process anything before version has been set. Also removed the unused ForEach* in CConnman as requested by @TheBlueMatt. |
utACK. Will test. You should consider making a brief release note that some broken P2P implementations (that never send VERACK) may stop working. |
@@ -213,36 +213,43 @@ class CConnman | |||
void ForEachNode(Callable&& func) | |||
{ | |||
LOCK(cs_vNodes); | |||
for (auto&& node : vNodes) | |||
func(node); | |||
for (auto&& node : vNodes) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it was already there, but can we remove the auto here? It doesnt save any space, and during review I always find myself thinking "OK, gonna go grep for CNode to find every use of it in the codebase", which our ever-increasing use of auto breaks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's conducive to large type changes, though. With this one in particular, when CNode becomes a shared_ptr<CNode>, this code requires no changes.
For exactly that reason, I actually have local changes which switch all loop operations on vNodes to for (auto&& node : vNodes)
:\
Other than my general objection to increased use of auto in the codebase (which I'll probably go through and get upset and rip out soon enough), utACK 47134d7de2ae57fbcb0715922094eca624ccb0fb |
I'm much happier reviewing a massive pile of sed-generated diff than having auto everywhere, but maybe I'm alone... I'll bring it up at the next meeting (because we didn't /just/ have a discussion of style or anything...).
…On January 28, 2017 1:01:29 PM EST, Cory Fields ***@***.***> wrote:
theuni commented on this pull request.
> @@ -213,36 +213,43 @@ class CConnman
void ForEachNode(Callable&& func)
{
LOCK(cs_vNodes);
- for (auto&& node : vNodes)
- func(node);
+ for (auto&& node : vNodes) {
It's conducive to large type changes, though. With this one in
particular, when CNode becomes a shared_ptr\<CNode\>, this code
requires no changes.
For exactly that reason, I actually have local changes which switch
_all_ loop operations on vNodes to ```for (auto&& node : vNodes)``` :\
--
You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub:
#9609
|
// Don't send anything until we get its version message | ||
if (pto->nVersion == 0 || pto->fDisconnect) | ||
// Don't send anything until the version handshake is complete | ||
if (!pto->fSuccessfullyConnected || pto->fDisconnect) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why isn't this !NodeFullyConnected(pto) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe because Cory wanted NodeFullyConnected to be a net-specific thing, and not a net_processing thing...its all a bit intermixed at the moment but I believe @theuni wants to fix it up soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, exactly that. I have no problem with changing it for the time-being though if you think it'd be more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't care, just looked like an obvious omission.
Needs rebase after #9626 |
I'll echo the tenuous utACK (though this needs to be rebased). I spent some time trying to find ways to break this code and wasn't able to come up with anything interesting. I do think that we should remove the asserts for the 0.14 release, but I'm indifferent to whether we keep them around in master. |
I really hate the assertions in the net code. They can and will lead to remotely-triggered DoS attacks. It's just not acceptable to do debugging in this way, sorry. They should be replaced with logging, or maybe an option 'stop on network errors' that needs to be enabled explicitly while debugging network code. But this should never be the default behavior. |
We use assertions in many places to ensure our logic is functioning - here we're asserting that the peer we're accessing has completed the handshake, a pretty important cleanup IMO. Indeed, this assertion is somewhat shit due to the half-cleaned-up state of net at the moment, but I don't think that means we should remove the assertion in 0.15. Let's remove it after the branch and open an issue to revisit before 0.15 release.
…On February 2, 2017 3:50:40 AM EST, "Wladimir J. van der Laan" ***@***.***> wrote:
I really hate the assertions in the net code. They can and will lead to
remotely-triggered DoS attacks. It's just not acceptable to do
debugging in this way, sorry.
They should be replaced with logging, or maybe an option 'stop on
network errors' that needs to be enabled explicitly while debugging
network code.
--
You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub:
#9609 (comment)
|
I see the way we use asserts as equivalent to exceptions, but making sure
they cannot be catched. Perhas activating them optionally as wumpus
suggests is interesting.
I really don't understand the reasoning for doing this in 0.14 but not in
master. If it needs to be revisited, fine, it can be done after this
change. I really dislike that version branches differ from master more
than they need, specially when it's for no good reason like it seems is tje
case here. IMO it should be only version change, release notes, etc. And
bug fixes that we were NOT aware of before doing the branch.
…On 2 Feb 2017 15:00, "Matt Corallo" ***@***.***> wrote:
We use assertions in many places to ensure our logic is functioning - here
we're asserting that the peer we're accessing has completed the handshake,
a pretty important cleanup IMO. Indeed, this assertion is somewhat shit due
to the half-cleaned-up state of net at the moment, but I don't think that
means we should remove the assertion in 0.15. Let's remove it after the
branch and open an issue to revisit before 0.15 release.
On February 2, 2017 3:50:40 AM EST, "Wladimir J. van der Laan" <
***@***.***> wrote:
>I really hate the assertions in the net code. They can and will lead to
>remotely-triggered DoS attacks. It's just not acceptable to do
>debugging in this way, sorry.
>
>They should be replaced with logging, or maybe an option 'stop on
>network errors' that needs to be enabled explicitly while debugging
>network code.
>
>--
>You are receiving this because you were mentioned.
>Reply to this email directly or view it on GitHub:
>#9609 (comment)
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#9609 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AA9jSg6dazVJUu6tZvFzCK6zEbW9N6Q5ks5rYeGYgaJpZM4Lp_La>
.
|
@jtimon the reason for not doing it on master is that the assert, if it fires, actually is an indication of a bug. On the flip side, because of the relative youth of this code (and the fact that such bugs are long-standing issues, not regressions), its probably better to not ship something with the assert in just yet. I dont see any problem having master and 0.14 differ by one line... |
In places where there is no (direct) interface with the outside world it's somewhat more acceptable. At least if there'd be crashes it is not remotely triggerable. It's assertions in net code that I have a problem with.
What I am afraid of is that that will be forgotten and they somehow will make it into a release anyway. I think leaving master even temporarily in a "dangerous" state is very risky. |
@laanwj ok, fair... @sdaftuar points out that its probably time to rethink our assertion infrastructure - get more agressive with assertions when not running a production build and get weaker on a few of them when doing so (and probably dont use assert(), since it is rarely what we want, but maybe log such errors in production with a big fat "PLEASE REPORT THIS" tag). |
(Back in town, sorry for slow response) I agree with removing the assertion. It probably never should have been there, as it introduces the possibility of remote crashers. I must admit that I'm glad that it forced us to fully address this problem, though. I'll rebase and turn the assertion into a log. |
This avoids having some vars set if the version negotiation fails. Also copy it all into CNode at the same site. nVersion and fSuccessfullyConnected are set last, as they are the gates for the other vars. Make them atomic for that reason.
…handshake Since ForEach* are can be used to send messages to all nodes, the caller may end up sending a message before the version handshake is complete. To limit this, filter out these nodes. While we're at it, may as well filter out disconnected nodes as well. Delete unused methods rather than updating them.
This is a change in behavior, though it's much more sane now than before.
Also cleaned up the comments and moved from the header to the .cpp so that logging headers aren't needed from net.h
utACK 08bb6f4 |
utACK 08bb6f4 |
ACK 08bb6f4 |
utACK 08bb6f4, testing with |
08bb6f4 net: log an error rather than asserting if send version is misused (Cory Fields) 7a8c251 net: Disallow sending messages until the version handshake is complete (Cory Fields) 12752af net: don't run callbacks on nodes that haven't completed the version handshake (Cory Fields) 2046617 net: deserialize the entire version message locally (Cory Fields) 80ff034 Dont deserialize nVersion into CNode, should fix #9212 (Matt Corallo)
@@ -1199,50 +1199,51 @@ bool static ProcessMessage(CNode* pfrom, std::string strCommand, CDataStream& vR | |||
CAddress addrFrom; | |||
uint64_t nNonce = 1; | |||
uint64_t nServiceInt; | |||
vRecv >> pfrom->nVersion >> nServiceInt >> nTime >> addrMe; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this has been in the code forever... is it really needing to be changed now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, many of the race conditions have been in the code forever and are only now being discovered and solved.
https://github.com/bitcoin/bitcoin/pull/9609/files#diff-1a8b9d1ad0a6fda5e751286c73102fc2R594 note: here we are cleaning up ForNode as used by the original DASH code as well
e733939 [Cleanup] Remove CConnman::Copy(Release)NodeVector, now unused (random-zebra) b09da57 [Refactor] Proper CConnman encapsulation of mnsync (random-zebra) 219061e [Refactor] Decouple SyncWithNode from CMasternodeSync::Process() (random-zebra) e1f3620 [Refactor] Proper CConnman encapsulation of proposals / votes sync (random-zebra) f38c9f9 miner: don't try to access connman if it doesn't exist (Fuzzbawls) 92ab619 Address feedback (Fuzzbawls) 9d5346a Do not set an addr time penalty when a peer advertises itself. (Gregory Maxwell) fe12ff9 net: No longer send local address in addrMe (Wladimir J. van der Laan) 14d6917 net: misc header cleanups (Cory Fields) 34ba0de net: make proxy receives interruptible (Cory Fields) 1bd97ef net: make net processing interruptable (Fuzzbawls) e24b4cc net: make net interruptible (Cory Fields) 814d0de net: add CThreadInterrupt and InterruptibleSleep (Cory Fields) 1443f2a net: a few small cleanups before replacing boost threads (Cory Fields) 58eabe4 net: move MAX_FEELER_CONNECTIONS into connman (Cory Fields) 874baba Convert ForEachNode* functions to take a templated function argument rather than a std::function to eliminate std::function overhead (Jeremy Rubin) 9485ab0 Made the ForEachNode* functions in src/net.cpp more pragmatic and self documenting (Jeremy Rubin) c1e59ad minor net cleanups (Fuzzbawls) 07ae004 net: move vNodesDisconnected into CConnman (Cory Fields) 276c946 net: add nSendBufferMaxSize/nReceiveFloodSize to CConnection::Options (Cory Fields) 22a9aff net: Introduce CConnection::Options to avoid passing so many params (Cory Fields) e4891bf net: Drop StartNode/StopNode and use CConnman directly (Cory Fields) 431575c net: pass CClientUIInterface into CConnman (Cory Fields) 48de47e net: Pass best block known height into CConnman (Cory Fields) 15eed91 net: move max/max-outbound to CConnman (Cory Fields) 2bf0921 net: move semOutbound to CConnman (Cory Fields) 481929f net: move nLocalServices/nRelevantServices to CConnman (Cory Fields) bcee6ae net: move SendBufferSize/ReceiveFloodSize to CConnman (Cory Fields) 6865469 net: move send/recv statistics to CConnman (Cory Fields) 1cec418 net: SocketSendData returns written size (Cory Fields) 2bb9dfa net: move messageHandlerCondition to CConnman (Cory Fields) 9c5a0df net: move nLocalHostNonce to CConnman (Cory Fields) a1394ef net: move nLastNodeId to CConnman (Cory Fields) 3804c29 net: move whitelist functions into CConnman (Cory Fields) dbde9be net: create generic functor accessors and move vNodes to CConnman (Cory Fields) 2e02467 net: Add most functions needed for vNodes to CConnman (Cory Fields) 5667e61 net: move added node functions to CConnman (Cory Fields) 37487ed net: Add oneshot functions to CConnman (Cory Fields) facf878 net: move ban and addrman functions into CConnman (Cory Fields) 091aaf2 net: handle nodesignals in CConnman (Cory Fields) 1e9fa0f net: move OpenNetworkConnection into CConnman (Cory Fields) 573200f net: Move socket binding into CConnman (Cory Fields) 7762b97 net: Pass CConnection to wallet rather than using the global (Fuzzbawls) 00591b8 net: Pass CConnman around as needed (Cory Fields) 2cd3d39 net: Add rpc error for missing/disabled p2p functionality (Cory Fields) f08e316 net: Create CConnman to encapsulate p2p connections (Cory Fields) 66337dc net: move CBanDB and CAddrDB out of net.h/cpp (Fuzzbawls) 10969f6 gui: add NodeID to the peer table (Cory Fields) 58044ac Fix some locks (Pieter Wuille) f9f8926 Do not shadow variables in networking code (Pavel Janík) Pull request description: This is the finalization of the upstream PRs being tracked in #1374 to update much of our networking code to more closely follow upstream improvements. The following PRs are included here: - bitcoin#8466 Do not shadow variables in networking code - bitcoin#8606 Fix some locks - bitcoin#8085 Begin encapsulation - bitcoin#9289 drop boost::thread_group - bitcoin#8740 No longer send local address in addrMe - bitcoin#8661 Do not set an addr time penalty when a peer advertises itself Additionally, during conflict resolution and backporting of 8085, the following additional upstream PR was included: - bitcoin#8715 only delete CConnman if it's been created Still TODO in future PRs: - bitcoin#9037 - bitcoin#9441 - bitcoin#9609 - bitcoin#9626 - bitcoin#9708 - bitcoin#12381 - bitcoin#18584 ACKs for top commit: furszy: same here, code ACK e733939. Nice work ☕ . furszy: ACK e733939 . random-zebra: ACK e733939 and merging... Tree-SHA512: 0fc3cca76d9ddc13f75fc9d48c48d215e6c0e0381377c0318436176a0e0ead73b511e0061a4b63f7052874730b6da8b0ffc2c94cba034bcc39aad4212b69ee22
30d5c66 net: correct addrman logging (Fuzzbawls) 8a2b7fe Don't send layer2 messages to peers that haven't completed the handshake (Fuzzbawls) dc10100 [bugfix] Making tier two thread interruptable. (furszy) 2ae76aa Move CNode::addrLocal access behind locked accessors (Fuzzbawls) 470482f Move CNode::addrName accesses behind locked accessors (Fuzzbawls) 35365e1 Move [clean|str]SubVer writes/copyStats into a lock (Fuzzbawls) d816a86 Make nServices atomic (Matt Corallo) 8a66add Make nStartingHeight atomic (Matt Corallo) 567c9b5 Avoid copying CNodeStats to make helgrind OK with buggy std::string (Matt Corallo) aea5211 Make nTimeConnected const in CNode (Matt Corallo) cf46680 net: fix a few races. (Fuzzbawls) c916fcf net: add a lock around hSocket (Cory Fields) cc8a93c net: rearrange so that socket accesses can be grouped together (Cory Fields) 6f731dc Do not add to vNodes until fOneShot/fFeeler/fAddNode have been set (Matt Corallo) 07c8d33 Ensure cs_vNodes is held when using the return value from FindNode (Matt Corallo) 110a44b Delete some unused (and broken) functions in CConnman (Matt Corallo) 08a12e0 net: log an error rather than asserting if send version is misused (Cory Fields) cd8b82c net: Disallow sending messages until the version handshake is complete (Fuzzbawls) 54b454b net: don't run callbacks on nodes that haven't completed the version handshake (Cory Fields) 2be6877 net: deserialize the entire version message locally (Fuzzbawls) 444f599 Dont deserialize nVersion into CNode (Fuzzbawls) f30f10e net: remove cs_vRecvMsg (Fuzzbawls) 5812f9e net: add a flag to indicate when a node's send buffer is full (Fuzzbawls) 5ec4db2 net: Hardcode protocol sizes and use fixed-size types (Wladimir J. van der Laan) de87ea6 net: Consistent checksum handling (Wladimir J. van der Laan) d4bcd25 net: push only raw data into CConnman (Cory Fields) b79e416 net: add CVectorWriter and CNetMsgMaker (Cory Fields) 63c51d3 net: No need to check individually for disconnection anymore (Cory Fields) 07d8c7b net: don't send any messages before handshake or after fdisconnect (Cory Fields) 9adfc7f net: Set feelers to disconnect at the end of the version message (Cory Fields) f88c06c net: handle version push in InitializeNode (Cory Fields) 04d39c8 net: construct CNodeStates in place (Cory Fields) 40a6c5d net: remove now-unused ssSend and Fuzz (Cory Fields) 681c62d drop the optimistic write counter hack (Cory Fields) 9f939f3 net: switch all callers to connman for pushing messages (Cory Fields) 8f9011d connman is in charge of pushing messages (Cory Fields) f558bb7 serialization: teach serializers variadics (Cory Fields) 01ea667 net: Use deterministic randomness for CNode's nonce, and make it const (Cory Fields) de1ad13 net: constify a few CNode vars to indicate that they're threadsafe (Cory Fields) 34050a3 Move static global randomizer seeds into CConnman (Pieter Wuille) 1ce349f net: add a flag to indicate when a node's process queue is full (Fuzzbawls) 5581b47 net: add a new message queue for the message processor (Fuzzbawls) 701b578 net: rework the way that the messagehandler sleeps (Fuzzbawls) 7e55dbf net: Add a simple function for waking the message handler (Cory Fields) 47ea844 net: record bytes written before notifying the message processor (Cory Fields) ffd4859 net: handle message accounting in ReceiveMsgBytes (Cory Fields) 8cee696 net: log bytes recv/sent per command (Fuzzbawls) 754400e net: set message deserialization version when it's time to deserialize (Fuzzbawls) d2b8e0a net: make CMessageHeader a dumb storage class (Fuzzbawls) cc24eff net: remove redundant max sendbuffer size check (Fuzzbawls) 32ab0c0 net: wait until the node is destroyed to delete its recv buffer (Cory Fields) 6e3f71b net: only disconnect if fDisconnect has been set (Cory Fields) 1b0beb6 net: make GetReceiveFloodSize public (Cory Fields) 229697a net: make vRecvMsg a list so that we can use splice() (Fuzzbawls) d2d71ba net: fix typo causing the wrong receive buffer size (Cory Fields) 50bb09d Add test-before-evict discipline to addrman (Ethan Heilman) Pull request description: This is a combination of multiple upstream PRs focused on optimizing the P2P networking flow after the introduction of CConnman encapsulation, and a few older PRs that were previously missed to support the later optimizations. The PRs are as follows: - bitcoin#9037 - net: Add test-before-evict discipline to addrman - bitcoin#5151 - make CMessageHeader a dumb storage class - bitcoin#6589 - log bytes recv/sent per command - bitcoin#8688 - Move static global randomizer seeds into CConnman - bitcoin#9050 - net: make a few values immutable, and use deterministic randomness for the localnonce - bitcoin#8708 - net: have CConnman handle message sending - bitcoin#9128 - net: Decouple CConnman and message serialization - bitcoin#8822 - net: Consistent checksum handling - bitcoin#9441 - Net: Massive speedup. Net locks overhaul - bitcoin#9609 - net: fix remaining net assertions - bitcoin#9626 - Clean up a few CConnman cs_vNodes/CNode things - bitcoin#9698 - net: fix socket close race - bitcoin#9708 - Clean up all known races/platform-specific UB at the time PR was opened - Excluded bitcoin/bitcoin@512731b and bitcoin/bitcoin@d8f2b8a, to be done in a separate PR ACKs for top commit: furszy: code ACK 30d5c66 , testnet sync from scratch went well and tested with #1829 on top as well and all good. furszy: mainnet sync went fine, ACK 30d5c66 . random-zebra: ACK 30d5c66 and merging... Tree-SHA512: 09689554f53115a45f810b47ff75d887fa9097ea05992a638dbb6055262aeecd82d6ce5aaa2284003399d839b6f2c36f897413da96cfa2cd3b858387c3f752c1
First change is from @TheBlueMatt.
4629656 added assertions to ensure that we never send out any messages before version negotiation has completed, but a few issues remain.
In cases where the version message failed to deserialize fully, we can end up with only some of the vars set, leading to sending messages that shouldn't be sent.
Additionally, the ForEach* functions allow the caller to send messages to all nodes, including those that aren't yet fully connected. To prevent that, filter out nodes that definitely shouldn't be sending/processing messages.
As a follow-up, I'd like to remove the assert, as I think it's done its job, but I'll save that discussion for another PR because I think @TheBlueMatt mentioned that he'd prefer to keep it.