-
Notifications
You must be signed in to change notification settings - Fork 37.7k
net, cli: use v2transport for manual/addrfetch connections, add to -netinfo #29058
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK
Playing devil's advocate: could there be any noticeable drawbacks of always trying with v2 first? I'm thinking of destination nodes behind firewalls with deep packet inspection which block ips from further connections if they don't follow the protocol (i.e. for bitcoin v1 connections, initiating with a version message). If that's the case, the user would only have the option to disable v2transport in general, or use the addnode
RPC instead. Not sure how realistic that "i want to connect via v1 explicitly, as connecting via v2 first causes trouble"-scenario is at all though.
peer.is_outbound ? "out" : "in", | ||
ConnectionTypeForNetinfo(peer.conn_type), | ||
peer.network, | ||
peer.transport_protocol_type == "detecting" ? "*" : peer.transport_protocol_type, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe '?' would also work to express the "detecting" state. But no hard feelings, typically users wouldn't see this a lot anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh, I actually changed it from ? to * last thing before opening the PR, because * was already used in a few other places for netinfo and ? would be a new symbol. But maybe ? is more natural here, @jonatack do you have an opinion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jonatack do you have an opinion?
Am catching up with merged pulls and discussions; this one is high on my list when I've caught up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jonatack do you have an opinion?
Would suggest display nothing during peer setup, like for the -netinfo mping and ping columns (*
already has a separate meaning in -netinfo, and ?
might look like there is a bug)
Also on this line would suggest to avoid checking for the full string "detecting", and instead do the cheaper check for the most frequent case of the string starting with "v".
Finally, would propose to drop displaying the "v" prefix in all the rows, as it doesn't add useful information, and instead use "v" for the column header.
Proposed in #29212.
Concept ACK |
cd88bb3
to
a30a1d9
Compare
I don't know either if this scenario is realistic, and, as you point out, there is the always the workaround of using the addnode rpc. I guess in general it's less nice to add options to |
@theStack I think the philosophy behind v2 was always "prevent the firewall from discriminating authenticated connections by making them look exactly like non-authenticated (but also encrypted) connections". From BIP:
So along that thinking, the situation you describe should trigger the user to be angry at their firewall (unless that becomes a widely used practice due to some wide regulation, which is only a theoretical, and i think unlikely, outcome in the future). Otherwise, if we encourage bending under such firewalls, the v2 battle is kinda lost anyway... |
@@ -2437,6 +2439,7 @@ void CConnman::ThreadOpenConnections(const std::vector<std::string> connect) | |||
} | |||
if (!interruptNet.sleep_for(std::chrono::milliseconds(500))) | |||
return; | |||
PerformReconnections(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the peer is v2
, but disconnected us due to other reason? (say a limit of inbounds, or us coming from a blocked network)? Here and in the logic below. In that case we should not attempt reconnection with v1
, it won't help.
This is probably about ShouldReconnectV1
behavior. I can't find where this case is handled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, ShouldReconnectV1
(which is called in DisconnectNode()
) will result in the peer only being put on the list of reconnections if it's in RecvState::KEY
. That is impossible if they ever sent us something, in which case PerformReconnections()
will do nothing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible that they haven't sent us anything (the state is KEY
), but disconnected us for other reason than not supporting v2
? I assume it's possible if an alternative client implements whatever because this is not prohibited by any bips... But even in the current client, say that IsBanned
is triggered? (in that case there is no real use in trying v2).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right, and I could confirm this with two of my nodes: If we connect via v2 peer and are disconnected because they have banned us, we would try once again with v1 (and then get disconnected again). Both for automatic connections, and (with this PR) manual ones. Although for manual ones it arguably probably matters less because we'll continuously retry with -connect
and -addnode
cli args anyway, even when everything is v1.
I don't really see a good way to avoid this, it's not like they notify us why they have disconnected us, so both situations would look the same for us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, so for this PR i'd just suggest clarifying the newly added comments like
// Attempt v2 connection if we support v2 - if our peer doesn't support it, we'll reconnect with v1.
Right now they are not accurate, as if the banning thing (or some other disconnection reason) don't exist.
If you feel like it, might as well touch the existing comments as well (say around ShouldReconnectV1
) :)
In future, we might communicate the disconnection reason (Ban, v1, eviction, protocol violation, or whatever else), probably through a new p2p message. Right now I'm not 100% sure the benefits are worthy of code complexity :) Better p2p logging (on the disconnected end) may be even better goal than saving up one useless v1/v2 reconnection.
Curious what you think about this idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I clarified the comments added in this PR.
I think that a single connection / reconnection attempt doesn't waste a meaningful amount of resources of either side, so I'm not sure if there is an actual problem here. Also, often one might not want the other side to know that you evicted them because they are banned (after all you banned them for a reason) - that might just help them evade the ban.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a single connection / reconnection attempt doesn't waste a meaningful amount of resources of either side
Note that every I2P connection requires building 2 tunnels, one for each direction. These are meant to be long-lived connections, as they take network resources and waiting for tunnels to be built for every peer connection adds delay to connection setup time.
Also worth noting that while this PR deals with manual connections, we absolutely depend on the reconnection mechanism for automatic connections too: |
I agree, this should only be done if there is a really good reason for it. @naumenkogs: Note that my concerns were not about the v2 approach in general, and I'm not worried at all about the reconnection mechanism. I was merely wondering whether connecting via v1 directly could still be demanded for some users during the next few releases until v2 is more widespread. In these rare cases (if they exist at all), they could of course always either turn off I guess one could argue that it's a bit inconsistent/confusing to allow explicit v1 connections for the |
utACK a30a1d9 Conceptually, I believe this makes sense both now and eventually:
Before enabling |
This affects manual connections made either with -connect, or with -addnode provided as a bitcoind config arg (the addnode RPC has an extra option for v2). We don't necessarily know if our peer supports v2, but will reconnect with v1 if they don't. In order to do that, improve the reconnection behavior such that we will reconnect after a sleep of 500ms (which usually should be enough for our peer to send us their version message).
a30a1d9
to
fb5bfed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tested ACK fb5bfed. addrfetch + manual connections aren't frequent and it would be useful to have this for transition to v2 one day.
used -connect
, -addnode
, -seednode
and observed reconnections happening if other peer was v1 and no reconnection if other peer was v2. an interesting behaviour which i realised made sense was to keep retrying with v2 when latency issue failures happen on tor.
2024-01-04T06:14:58Z [net:debug] trying v2 connection "xyz" lastseen=0.0hrs
2024-01-04T06:14:58Z [net] SOCKS5 connecting "xyz"
2024-01-04T06:14:58Z Socks5() connect to "xyz":8333 failed: general failure
2024-01-04T06:14:59Z [net:debug] trying v2 connection "xyz" lastseen=0.0hrs
2024-01-04T06:14:59Z [net] SOCKS5 connecting "xyz"
2024-01-04T06:14:59Z Socks5() connect to "xyz":8333 failed: general failure
2024-01-04T06:15:00Z [net:debug] trying v2 connection "xyz" lastseen=0.0hrs
2024-01-04T06:15:00Z [net] SOCKS5 connecting "xyz"
2024-01-04T06:15:00Z Socks5() connect to "xyz":8333 failed: general failure
Yes! The connection must have been established on the TCP/IP level for a reconnection attempt to happen. It's the same reason why we don't try to reconnect with v1 if the peer is just offline. |
utACK fb5bfed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK fb5bfed
Reviewed the code and verified the intended behavior for manual connections (-connect
, -addnode
) and addrfetch connections (-seednode
) by specifying both a p2p-v2-supporting peer and a p2p-v1 peer in different bitcoind runs. For the v1 peer, the first (v2) connection attempt fails, and a successful v1 reconnect happens after, just as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK fb5bfed
Not an objection, but this feels backwards to me: presumably the user has manually listed those peers as addnode/connect because it's important that their node be connected to those peers, but if there are any bugs in the "v2 didn't work, retry as v1" logic, it's exactly those peers that will be inaccessible. But if we're confident there aren't bugs in that logic, why not just apply it to all peers? The "if you don't like it, just set -v2transport=0" approach to avoiding bugs/risk works just as well in either case. (Doing it for addrfetch otoh seems fine) |
Do you mean why not apply it to automatic peers as well per default (and ignore the information from the service flag)? I thought the idea was to save one disconnection / reconnection, and the time spent doing these when we probably know better. But as I wrote in #29058 (comment), I think that we depend on the reconnection mechanism for automatic connection as well, because anyone can add a wrong "v2" service flag to a given v1 node in most people's addrman (but not vice versa!). |
ACK fb5bfed |
@@ -517,10 +518,11 @@ class NetinfoRequestHandler : public BaseRequestHandler | |||
const std::string addr{peer["addr"].get_str()}; | |||
const std::string age{conn_time == 0 ? "" : ToString((time_now - conn_time) / 60)}; | |||
const std::string sub_version{peer["subver"].get_str()}; | |||
const std::string transport{peer["transport_protocol_type"].get_str()}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fb5bfed This will cause -netinfo
to break when calling it on a node that is running pre-v26 bitcoind, as getpeerinfo
doesn't yet return a "transport_protocol_type" field. It just needs an IsNull()
check, as done in the next line and for a few other newer fields added after -netinfo
.
Edit: addressed in #29212.
Reviewed fb5bfed; WIP review of the other two commits. |
5fa7460 Fix -netinfo backward compat with getpeerinfo pre-v26 (Jon Atack) Pull request description: Commit fb5bfed in #29058 will cause `-netinfo` to break when calling it on a node that is running pre-v26 bitcoind, as `getpeerinfo` doesn't yet return a "transport_protocol_type" field. Fix this by adding an `IsNull()` check, as already done for other recent getpeerinfo fields, and also in the same commit: a) avoid checking for the full string "detecting", and instead do the cheaper check for the most frequent case of the string starting with "v" b) drop displaying the "v" prefix in all the rows, as it doesn't add useful information, and instead use "v" for the column header c) display nothing when a value isn't determined yet, like for the -netinfo mping and ping columns (as `*` already has a separate meaning in this dashboard, and `?` might look like there is a bug) ACKs for top commit: mzumsande: Code Review ACK 5fa7460 achow101: ACK 5fa7460 kristapsk: ACK 5fa7460 Tree-SHA512: 4afc513dc669b95037180008eb4c57fc0a0d742c02f24b236562d6b8daad5c120eb1ce0d90e51696e0f9b8361a72fc930c0b64f04902cf96fb48c8e042e58624
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Post-merge review ACK, pending digging a bit deeper into the IRL effects.
CAddress addr; | ||
CSemaphoreGrant grant(*semOutbound, /*fTry=*/true); | ||
if (grant) { | ||
OpenNetworkConnection(addr, false, std::move(grant), strDest.c_str(), ConnectionType::ADDR_FETCH, /*use_v2transport=*/false); | ||
OpenNetworkConnection(addr, false, std::move(grant), strDest.c_str(), ConnectionType::ADDR_FETCH, use_v2transport); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
9eed22e nit, code unneeded until/unless there is a grant
@@ -2316,12 +2316,12 @@ void CConnman::ProcessAddrFetch()
strDest = m_addr_fetches.front();
m_addr_fetches.pop_front();
}
- // Attempt v2 connection if we support v2 - we'll reconnect with v1 if our
- // peer doesn't support it or immediately disconnects us for another reason.
- const bool use_v2transport(GetLocalServices() & NODE_P2P_V2);
- CAddress addr;
CSemaphoreGrant grant(*semOutbound, /*fTry=*/true);
if (grant) {
+ // Attempt v2 connection if we support v2 - we'll reconnect with v1 if our
+ // peer doesn't support it or immediately disconnects us for another reason.
+ const bool use_v2transport(GetLocalServices() & NODE_P2P_V2);
+ CAddress addr;
OpenNetworkConnection(addr, false, std::move(grant), strDest.c_str(), ConnectionType::ADDR_FETCH, use_v2transport);
}
}
@@ -2437,6 +2439,7 @@ void CConnman::ThreadOpenConnections(const std::vector<std::string> connect) | |||
} | |||
if (!interruptNet.sleep_for(std::chrono::milliseconds(500))) | |||
return; | |||
PerformReconnections(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a single connection / reconnection attempt doesn't waste a meaningful amount of resources of either side
Note that every I2P connection requires building 2 tunnels, one for each direction. These are meant to be long-lived connections, as they take network resources and waiting for tunnels to be built for every peer connection adds delay to connection setup time.
3ba815b Make v2transport default for addnode RPC when enabled (Pieter Wuille) Pull request description: Since #29058, several types of manually configured connections will attempt v2 connections when `-v2transport` is enabled, except for the `addnode` RPC, as that one has an explicit argument to enable or disable. Make the default for that RPC match the `-v2transport` setting so the behavior matches that of other manual connections from a user perspective. ACKs for top commit: achow101: ACK 3ba815b kristapsk: ACK 3ba815b theStack: Code-review ACK 3ba815b Tree-SHA512: 31ef48cf1e533abb17866020378c004df929e626074dc98b3229fb60a66de58435e95c8fda8d1b463e1208aa39d1f42d239818e7e58595a3738089920598befc
0a53361 docs: add release notes for #27114 (brunoerg) e6b8f19 test: add coverage for whitelisting manual connections (brunoerg) c985eb8 test: add option to speed up tx relay/mempool sync (brunoerg) 66bc6e2 Accept "in" and "out" flags to -whitelist to allow whitelisting manual connections (Luke Dashjr) 8e06be3 net_processing: Move extra service flag into InitializeNode (Luke Dashjr) 9133fd6 net: Move `NetPermissionFlags::Implicit` verification to `AddWhitelistPermissionFlags` (Luke Dashjr) 2863d7d net: store `-whitelist{force}relay` values in `CConnman` (brunoerg) Pull request description: Revives #17167. It allows whitelisting manual connections. Fixes #9923 Since there are some PRs/issues around this topic, I'll list some motivations/comments for whitelisting outbound connections from them: - Speed-up tx relay/mempool sync for testing purposes (my personal motivation for this) - In #26970, theStack pointed out that we whitelist peers to speed up tx relay for fast mempool synchronization, however, since it applies only for inbound connections and considering the topology `node0 <--- node1 <---- node2 <--- ... <-- nodeN`, if a tx is submitted from any node other than node0, the mempool synchronization can take quite long. - #29058 (comment) - "Before enabling -v2transport by default (which I'd image may happen after #24748) we could consider a way to force manual connections to be only-v1 or even only-v2 (disabling reconnect-with-v1). A possibility could be through a net permission flag, if #27114 makes it in." - #17167 (comment) - "This would allow us to use #25355 when making outgoing connections to all nodes, except to whitelisted ones for which we would use our persistent I2P address." - Force-relay/mempool permissions for a node you intentionally connected to. ACKs for top commit: achow101: ACK 0a53361 sr-gi: re-ACK [0a53361](0a53361) pinheadmz: ACK 0a53361 Tree-SHA512: 97a79bb854110da04540897d2619eda409d829016aafdf1825ab5515334b0b42ef82f33cd41587af235b3af6ddcec3f2905ca038b5ab22e4c8a03d34f27aebe1
This affects manual connections made either with -connect, or with -addnode provided as a bitcoind config arg (the addnode RPC has an extra option for v2). We don't necessarily know if our peer supports v2, but will reconnect with v1 if they don't. In order to do that, improve the reconnection behavior such that we will reconnect after a sleep of 500ms (which usually should be enough for our peer to send us their version message). Github-Pull: bitcoin#29058 Rebased-From: 770c031
Github-Pull: bitcoin#29058 Rebased-From: 9eed22e
Github-Pull: bitcoin#29058 Rebased-From: fb5bfed
, bitcoin#28849, bitcoin#28805, bitcoin#28951, bitcoin#29058, bitcoin#29239, partial bitcoin#28331, bitcoin#28452 (BIP324 backports: part 2) 5dd60c4 merge bitcoin#29239: Make v2transport default for addnode RPC when enabled (Kittywhiskers Van Gogh) b2ac426 merge bitcoin#29058: use v2transport for manual/addrfetch connections, add to -netinfo (Kittywhiskers Van Gogh) 92e862a merge bitcoin#28951: damage ciphertext/aad in full byte range (Kittywhiskers Van Gogh) 4e96e26 merge bitcoin#28805: Make existing functional tests compatible with --v2transport (Kittywhiskers Van Gogh) 9371e2e merge bitcoin#28849: fix node index bug when comparing peerinfo (Kittywhiskers Van Gogh) 400c9dd merge bitcoin#28634: add check for missing garbage terminator detection (Kittywhiskers Van Gogh) 65eb194 merge bitcoin#28588: add checks for v1 prefix matching / wrong network magic detection (Kittywhiskers Van Gogh) 6074974 merge bitcoin#28577: raise V1_PREFIX_LEN from 12 to 16 (Kittywhiskers Van Gogh) ff92d1a partial bitcoin#28331: BIP324 integration (Kittywhiskers Van Gogh) f9f8805 fix: drop `CConnman::mapNodesWithDataToSend`, use transport data (UdjinM6) d39d8a4 partial bitcoin#28452: Do not use std::vector = {} to release memory (Kittywhiskers Van Gogh) Pull request description: ## Additional Information * Dependent on #6280 * Dependent on #6276 * Dependency for #6329 * [bitcoin#28452](bitcoin#28452) was backported as the behavior it introduces is required for backports like [bitcoin#28331](bitcoin#28331). As it pre-dates earlier BIP324 backports, the `ClearShrink` usage from those backports have also been incorporated into this backport. * When backporting [bitcoin#28331](bitcoin#28331), in TestFramework's `add_nodes()`, `extra_args[i]` is not converted to a `list` like it is done upstream and avoiding duplication of `-v2transport=1` is instead done by an additional check. This is done to prevent test failures in `feature_mnehf.py`. * The local variable `args` is removed in [bitcoin#28805](bitcoin#28805) as it does nothing (the argument appending logic is removed as part of the backport) and upstream removes it anyways. Special thanks to UdjinM6 for significant contributions to this and dependent PRs! 🎉 ## Breaking Changes None expected. ## Checklist - [x] I have performed a self-review of my own code - [x] I have commented my code, particularly in hard-to-understand areas **(note: N/A)** - [x] I have added or updated relevant unit/integration/functional/e2e tests - [x] I have made corresponding changes to the documentation - [x] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_ ACKs for top commit: UdjinM6: light ACK 5dd60c4 PastaPastaPasta: utACK 5dd60c4 Tree-SHA512: 7f3d0e54e1c96fc99b2d6b1e64485110aa73aeec0e4e4245ed4e6dc0529a7608e9c39103af636d5945d289775c40d3d15d7d4a75bea82462194dbf355fca48dc
Some preparations before enabling
-v2transport
as the default:-connect
,-addnode
config arg and-seednode
if-v2transport
is enabled.Our peer may or may not support v2, but I don't think an extra option is necessary for any of these (we have that for the
addnode
rpc), because we have the reconnection mechanism that will try again withv1
if our peer doesn't supportv2
.-netinfo
. I added it next to thenet
column because I thought it looked nice there, but if people prefer it somewhere else I'm happy to move it.