-
Notifications
You must be signed in to change notification settings - Fork 37.7k
p2p: Improve diversification of new connections #19860
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
p2p: Improve diversification of new connections #19860
Conversation
Draft until #19724 is merged. |
There is an interesting overlap with #19858 (periodic BLOCK_RELAY conns to sync tips). I think those temporarily BLOCK_RELAY conns probably should be selected similarly to the existent new Similarly to the work in this PR, I think we better select them as diverse as we can. If an attacker is capable to influence this selection, making it less diverse (by dropping one of the checks) won't really help neither with the robustness of these selected connections nor with their privacy. |
Do you have a way to model what an attacker's capability would need to be in order to deduce something like this? On the face of it, it seems strange to me make this kind of peering behavior dependent of the order in which we connect peers (full-relay first or block-relay first), and in particular, my presumption would be that it's more important to aim for peer netgroup diversity (as a first-order solution to being eclipsed) than it is to worry about inference based on that diversity (which is a second-order concern with regards to eclipse attacks) -- but if the inference capability is strong then perhaps I'm mistaken in thinking that? |
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
@sdaftuar I agree that immediate diversity is more important than theoretical anti-inference. But since it's easy to get confused here, I want to point out that after the second commit:
The difference is only that existing block relay conns don't affect new full relay conns. So, like, just 2 of the existing peers, for now, won't participate in diversity enforcement. If we have more (say 8 block relay conns), we might introduce some randomness: like 4 conns are used to diversify new full conns, and 4 are hidden.
No, not right now. I just assume an attacker can influence what our AddrMan has. The influence shouldn't even be that strong, they just have to force us to connect to a set of (given by them) peers on-demand with some probability. For example, they exploited eviction and made us connect to them (via regular conns) exclusively. Now to eclipse us they have to find out about our block-relay-only conns and potentially disrupt those. May be feasible with these capabilities and enough protocol knowledge :) If this is not convincing, I can drop the commit until when/if I can actually demostrate a feasible attack. |
c63c875
to
6be550c
Compare
Added one more improvement: diversify by MANUAL connections (consider them regular full outbound). The privacy considerations I have for BLOCK_RELAY don't apply here because MANUAL are not particularly private... (since they support full relay) Not sure whether they should be excluded when considering |
6be550c
to
b2e36ff
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you have don't consider not only AddrMan poisoning capabilities but also more powerful ones like connections-interference capabilities. E.g an adversary observing an outbound connection from Alice to Bob and repeatedly opening inbound towards Bob until Alice is dropped. More concretely, if we assume that an attacker can map a victim outbound full-relay peers through tx-relay and manipulate those connections liveliness, a serie of which netgroups have been visited can be established. Observing the missing ones among the publicly available peerspace should hint attacker towards which netgroups are occupied by victim's block-relay slots.
I understand this PR as aiming to provoke netgroup collisions among the set of full-relay and block-relay peers to avoid making observable the absence of it. And thus improving the non-observability of block-relay peers. Of course it has as a side-effect to decrease the global outbound peers diversity as full-relay netgroups may overlap with block-relay ones. I think this is acceptable as the diversity reduction come at the win of making it far harder to disrupt block-relay connections. Previously an attacker should have committed a higher number Sybils spread in more diverse netgroups and could have interfered with victim's honest block-relay to increase odds of its deployed Sybils being selected as block-relay. Now, an attacker may commit a lower number of Sybils spread in a less diverse netgroups but is more restrained to interfere, and thus likely decrease its odds of being selected.
Overall, I think diversity and network topology non-observability are improving eclipse-attack resistance on different axis. As of today, the current trade-off we have between them is more due to the absence of clean separation between block-relay/tx-relay and a low number of block-relay-only.
Concept ACK
5034215
to
9c2babb
Compare
This PR is now updated to be a non-controversial improvement:
I dropped the part about BLOCK_RELAY diversity. |
9c2babb
to
c7699ee
Compare
c7699ee
to
d2c7296
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK d2c7296
The commit message of 513d52b p2p: Account for MANUAL conns when diversifying persistent outbound conns
reads: "this doesn't affect how we choose MANUAL connections" which I find a bit confusing because we (== the Bitcoin Core software) never chooses manual connections - they are configured by the user.
…onns Previously, we would make connections to peer from the netgroups to which our MANUAL outbound connections belong. However, they should be seen as regular connections from Addrman when it comes to netgroup diversity check, since the same rationale can be applied. Note, this has nothing to do with how we connect to MANUAL connections: we connect to them unconditionally.
d2c7296
to
aeec31d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK aeec31d
Code Review ACK aeec31d
I guess it depends on why you connected to the manual peer: Sometimes you may have a higher trust in that node, maybe because you know the operator. In that case, you wouldn't necessarily see a greater risk that another node from the same group would be colluding with them. On the other hand, even if you do trust them, there could still be attacks on an ISP level that would apply to multiple nodes from one group, so I think it makes sense to include them in the diversification as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review ACK aeec31d
Verified rebase to master and debug build are ok.
This pull has ACKs by @vasild since 6 months, @mzumsande and myself since six weeks, and a concept ACK by @ariard, and has seen review by a number of contributors, might be RFM. |
Verified rebase to current master is still clean. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK not taking INBOUND
, ADDR_FETCH
or FEELER
connections into account, and taking MANUAL
connections into account.
I think the while (!interruptNet)
loop in ThreadOpenConnections()
is a bit of a mess, and should be refactored out into its own function. Adding extra assertions halfway through that loop that are testing a variable set further up in the function is probably making it slightly worse, not better.
// The following checks are never applied to MANUAL and ADDR_FETCH, | ||
// because those connections are made elsewhere. | ||
assert(conn_type != ConnectionType::MANUAL); | ||
assert(conn_type != ConnectionType::ADDR_FETCH); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how this is an improvement. conn_type
is a local variable that is set in lines 1932-1988 above, and can never be set to ConnectionType::MANUAL
or ConnectionType::ADDR_FETCH
. What are these assertions preventing or protecting against?
// Short-lived outbound connections should not affect how we select outbound | ||
// peers from addrman. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to duplicate this comment.
🐙 This pull request conflicts with the target branch and needs rebase. Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft". |
This is nice to have. @naumenkogs, do you plan to rebase it? |
@vasild not in the next 2 weeks, but possibly afterwards. Feel free to mark for grabs i guess. |
Will close, and mark up for grabs for now. |
revived this in #27264. |
72e8ffd p2p: Account for MANUAL conns when diversifying persistent outbound conns (Gleb Naumenko) 3faae99 p2p: Diversify connections only w.r.t *persistent* outbound peers (Gleb Naumenko) Pull request description: Revives #19860. In order to make sure that our persistent outbound slots belong to different netgroups, distinct net groups of our peers are added to [`setConnected`](https://github.com/bitcoin/bitcoin/blob/8c4958bd4c06026dc108bc7f5f063d1f389d279b/src/net.cpp#L1716). We’d only open a persistent outbound connection to peers which have a different netgroup compared to those netgroups present in `setConnected`. **behaviour on master** we open persistent outbound connections to peers which have different netgroups compared to outbound full relay, block relay, addrfetch and feeler connection peers. **behaviour on PR** netgroup diversity is based on outbound full relay, block relay and manual connection peers. **rationale** - addrfetch and feeler connections are short lived connections and shouldn’t affect how we select outbound peers from addrman. - manual connections are like regular connections when viewed from addrman’s netgroup diversity point of view and should affect how we select outbound peers from addrman ACKs for top commit: amitiuttarwar: code review ACK 72e8ffd vasild: ACK 72e8ffd mzumsande: Code Review ACK 72e8ffd brunoerg: crACK 72e8ffd Tree-SHA512: 359451945a707b312ef6c2696a3a9d4256ab14dab9bd461cca4a52dae034db099012df6de3faef2f3fb38184b05996402ac280b681959483824419b6deb4db1a
This PR improves the logic around netgroup diversity of new connections. More specifically, it improves privacy and fixes a couple things I consider buggy. It also adds a bunch of documentation.
This PR changes the following:
ADDR_FETCH and FEELER conns are now not included incount_failure
consideration (see new documentation). Even though it makes sense to consider them if they are present at the moment of the call, we should not rely on that luck. Instead, we should be explicit.Prevents BLOCK_RELAY connections from affecting our further full-relay peer selection, because otherwise, an attacker with sufficient ADDR-related capabilities may infer to which netgroups those (supposedly more private) connections belong.Abandoning this feature for now because the trade-offs are unclear, see P2P meeting at #bitcoin-core-dev from Sept. 8.I’m also not sure whether new BLOCK_RELAY connections should be distinct w.r.t existent full outbound connections. Seems like another influence vector for an attacker with AddrMan poisoning capabilities, but this threat is probably too advanced. And having a good diversity of BLOCK_RELAY conns is very beneficial.