Skip to content

Conversation

naumenkogs
Copy link
Member

@naumenkogs naumenkogs commented Sep 2, 2020

This PR improves the logic around netgroup diversity of new connections. More specifically, it improves privacy and fixes a couple things I consider buggy. It also adds a bunch of documentation.

This PR changes the following:

  1. prevents short-lived ADDR_FETCH and FEELER connections from affecting our further peer selection, because they are short-lived and should not matter
  2. ADDR_FETCH and FEELER conns are now not included in count_failure consideration (see new documentation). Even though it makes sense to consider them if they are present at the moment of the call, we should not rely on that luck. Instead, we should be explicit.
  3. Prevents BLOCK_RELAY connections from affecting our further full-relay peer selection, because otherwise, an attacker with sufficient ADDR-related capabilities may infer to which netgroups those (supposedly more private) connections belong. Abandoning this feature for now because the trade-offs are unclear, see P2P meeting at #bitcoin-core-dev from Sept. 8.
  4. Look at the existing MANUAL peers when enforcing diversity of new outbound conns.

I’m also not sure whether new BLOCK_RELAY connections should be distinct w.r.t existent full outbound connections. Seems like another influence vector for an attacker with AddrMan poisoning capabilities, but this threat is probably too advanced. And having a good diversity of BLOCK_RELAY conns is very beneficial.

@naumenkogs naumenkogs marked this pull request as draft September 2, 2020 14:07
@naumenkogs
Copy link
Member Author

Draft until #19724 is merged.

@naumenkogs
Copy link
Member Author

naumenkogs commented Sep 2, 2020

There is an interesting overlap with #19858 (periodic BLOCK_RELAY conns to sync tips).

I think those temporarily BLOCK_RELAY conns probably should be selected similarly to the existent new BLOCK_RELAY conns, meaning diversified by both set_connected_full_relay and set_connected_block_relay. If many of our peers are Sybils, it also would improve the chance of not connecting to the same netgroup.

Similarly to the work in this PR, I think we better select them as diverse as we can. If an attacker is capable to influence this selection, making it less diverse (by dropping one of the checks) won't really help neither with the robustness of these selected connections nor with their privacy.

@DrahtBot DrahtBot added the P2P label Sep 2, 2020
@sdaftuar
Copy link
Member

sdaftuar commented Sep 2, 2020

Prevents BLOCK_RELAY connections from affecting our further full-relay peer selection, because otherwise, an attacker with sufficient ADDR-related capabilities may infer to which netgroups those (supposedly more private) connections belong.

Do you have a way to model what an attacker's capability would need to be in order to deduce something like this? On the face of it, it seems strange to me make this kind of peering behavior dependent of the order in which we connect peers (full-relay first or block-relay first), and in particular, my presumption would be that it's more important to aim for peer netgroup diversity (as a first-order solution to being eclipsed) than it is to worry about inference based on that diversity (which is a second-order concern with regards to eclipse attacks) -- but if the inference capability is strong then perhaps I'm mistaken in thinking that?

@DrahtBot
Copy link
Contributor

DrahtBot commented Sep 2, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #24170 (p2p, rpc: Manual block-relay-only connections with addnode by mzumsande)
  • #22910 (net: Encapsulate asmap in NetGroupManager by jnewbery)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@naumenkogs
Copy link
Member Author

naumenkogs commented Sep 3, 2020

@sdaftuar I agree that immediate diversity is more important than theoretical anti-inference. But since it's easy to get confused here, I want to point out that after the second commit:

  • new block relay conns are made with full diversity, same as before
  • new full relay conns are made distinct w.r.t. existing full relay conns

The difference is only that existing block relay conns don't affect new full relay conns.

So, like, just 2 of the existing peers, for now, won't participate in diversity enforcement. If we have more (say 8 block relay conns), we might introduce some randomness: like 4 conns are used to diversify new full conns, and 4 are hidden.

Do you have a way to model what an attacker's capability would need to be in order to deduce something like this?

No, not right now. I just assume an attacker can influence what our AddrMan has. The influence shouldn't even be that strong, they just have to force us to connect to a set of (given by them) peers on-demand with some probability.

For example, they exploited eviction and made us connect to them (via regular conns) exclusively. Now to eclipse us they have to find out about our block-relay-only conns and potentially disrupt those. May be feasible with these capabilities and enough protocol knowledge :)
Since block-relay-only conns are created (sort of) specifically to handle scenarios like this, I wanna make them more robust.


If this is not convincing, I can drop the commit until when/if I can actually demostrate a feasible attack.

@naumenkogs naumenkogs force-pushed the 2020-09-diverse-new-conn-fixes branch from c63c875 to 6be550c Compare September 3, 2020 10:52
@naumenkogs
Copy link
Member Author

naumenkogs commented Sep 3, 2020

Added one more improvement: diversify by MANUAL connections (consider them regular full outbound). The privacy considerations I have for BLOCK_RELAY don't apply here because MANUAL are not particularly private... (since they support full relay)

Not sure whether they should be excluded when considering count_failure, similarly to "Having one persistent outbound peer probably means it's from our local network [...]" reasoning in the comment.

@naumenkogs naumenkogs marked this pull request as ready for review September 3, 2020 11:33
@naumenkogs naumenkogs force-pushed the 2020-09-diverse-new-conn-fixes branch from 6be550c to b2e36ff Compare September 3, 2020 11:50
Copy link

@ariard ariard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you have don't consider not only AddrMan poisoning capabilities but also more powerful ones like connections-interference capabilities. E.g an adversary observing an outbound connection from Alice to Bob and repeatedly opening inbound towards Bob until Alice is dropped. More concretely, if we assume that an attacker can map a victim outbound full-relay peers through tx-relay and manipulate those connections liveliness, a serie of which netgroups have been visited can be established. Observing the missing ones among the publicly available peerspace should hint attacker towards which netgroups are occupied by victim's block-relay slots.

I understand this PR as aiming to provoke netgroup collisions among the set of full-relay and block-relay peers to avoid making observable the absence of it. And thus improving the non-observability of block-relay peers. Of course it has as a side-effect to decrease the global outbound peers diversity as full-relay netgroups may overlap with block-relay ones. I think this is acceptable as the diversity reduction come at the win of making it far harder to disrupt block-relay connections. Previously an attacker should have committed a higher number Sybils spread in more diverse netgroups and could have interfered with victim's honest block-relay to increase odds of its deployed Sybils being selected as block-relay. Now, an attacker may commit a lower number of Sybils spread in a less diverse netgroups but is more restrained to interfere, and thus likely decrease its odds of being selected.

Overall, I think diversity and network topology non-observability are improving eclipse-attack resistance on different axis. As of today, the current trade-off we have between them is more due to the absence of clean separation between block-relay/tx-relay and a low number of block-relay-only.

Concept ACK

@naumenkogs naumenkogs force-pushed the 2020-09-diverse-new-conn-fixes branch 2 times, most recently from 5034215 to 9c2babb Compare September 9, 2020 07:15
@naumenkogs
Copy link
Member Author

This PR is now updated to be a non-controversial improvement:

  • don't diversify by ADDR_FETCH and FEELER
  • do diversify by MANUAL

I dropped the part about BLOCK_RELAY diversity.

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK d2c7296

The commit message of 513d52b p2p: Account for MANUAL conns when diversifying persistent outbound conns reads: "this doesn't affect how we choose MANUAL connections" which I find a bit confusing because we (== the Bitcoin Core software) never chooses manual connections - they are configured by the user.

…onns

Previously, we would make connections to peer from the netgroups to which
our MANUAL outbound connections belong.
However, they should be seen as regular connections from Addrman when it comes to netgroup diversity check, since the same rationale can be applied.

Note, this has nothing to do with how we connect to MANUAL connections:
we connect to them unconditionally.
@naumenkogs naumenkogs force-pushed the 2020-09-diverse-new-conn-fixes branch from d2c7296 to aeec31d Compare September 15, 2021 09:08
Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK aeec31d

@mzumsande
Copy link
Contributor

Code Review ACK aeec31d

I just think MANUAL is just no different from a regular outbound peer. If you have a MANUAL to netgroup XYZ, you probably don't want OUTBOUND_FULL_RELAY or whatever to XYZ either (previously we would allow this).

I guess it depends on why you connected to the manual peer: Sometimes you may have a higher trust in that node, maybe because you know the operator. In that case, you wouldn't necessarily see a greater risk that another node from the same group would be colluding with them.

On the other hand, even if you do trust them, there could still be attacks on an ISP level that would apply to multiple nodes from one group, so I think it makes sense to include them in the diversification as well.

Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review ACK aeec31d

Verified rebase to master and debug build are ok.

@fanquake fanquake requested review from jnewbery and sipa January 28, 2022 08:40
@jonatack
Copy link
Member

This pull has ACKs by @vasild since 6 months, @mzumsande and myself since six weeks, and a concept ACK by @ariard, and has seen review by a number of contributors, might be RFM.

@jonatack
Copy link
Member

Verified rebase to current master is still clean.

Copy link
Contributor

@jnewbery jnewbery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK not taking INBOUND, ADDR_FETCH or FEELER connections into account, and taking MANUAL connections into account.

I think the while (!interruptNet) loop in ThreadOpenConnections() is a bit of a mess, and should be refactored out into its own function. Adding extra assertions halfway through that loop that are testing a variable set further up in the function is probably making it slightly worse, not better.

Comment on lines +2040 to +2044
// The following checks are never applied to MANUAL and ADDR_FETCH,
// because those connections are made elsewhere.
assert(conn_type != ConnectionType::MANUAL);
assert(conn_type != ConnectionType::ADDR_FETCH);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how this is an improvement. conn_type is a local variable that is set in lines 1932-1988 above, and can never be set to ConnectionType::MANUAL or ConnectionType::ADDR_FETCH. What are these assertions preventing or protecting against?

Comment on lines +1920 to +1921
// Short-lived outbound connections should not affect how we select outbound
// peers from addrman.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to duplicate this comment.

@DrahtBot
Copy link
Contributor

🐙 This pull request conflicts with the target branch and needs rebase.

Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft".

@vasild
Copy link
Contributor

vasild commented Oct 24, 2022

This is nice to have. @naumenkogs, do you plan to rebase it?

@naumenkogs
Copy link
Member Author

@vasild not in the next 2 weeks, but possibly afterwards. Feel free to mark for grabs i guess.

@fanquake
Copy link
Member

fanquake commented Dec 5, 2022

Feel free to mark for grabs i guess.

Will close, and mark up for grabs for now.

@stratospher
Copy link
Contributor

revived this in #27264.

fanquake added a commit that referenced this pull request Mar 19, 2023
72e8ffd p2p: Account for MANUAL conns when diversifying persistent outbound conns (Gleb Naumenko)
3faae99 p2p: Diversify connections only w.r.t *persistent* outbound peers (Gleb Naumenko)

Pull request description:

  Revives #19860.

  In order to make sure that our persistent outbound slots belong to different netgroups, distinct net groups of our peers are added to [`setConnected`](https://github.com/bitcoin/bitcoin/blob/8c4958bd4c06026dc108bc7f5f063d1f389d279b/src/net.cpp#L1716). We’d only open a persistent outbound connection to peers which have a different netgroup compared to those netgroups present in `setConnected`.

  **behaviour on master**

  we open persistent outbound connections to peers which have different netgroups compared to outbound full relay, block relay, addrfetch and feeler connection peers.

  **behaviour on PR**

  netgroup diversity is based on outbound full relay, block relay and manual connection peers.

  **rationale**

  - addrfetch and feeler connections are short lived connections and shouldn’t affect how we select outbound peers from addrman.
  - manual connections are like regular connections when viewed from addrman’s netgroup diversity point of view and should affect how we select outbound peers from addrman

ACKs for top commit:
  amitiuttarwar:
    code review ACK 72e8ffd
  vasild:
    ACK 72e8ffd
  mzumsande:
    Code Review ACK 72e8ffd
  brunoerg:
    crACK 72e8ffd

Tree-SHA512: 359451945a707b312ef6c2696a3a9d4256ab14dab9bd461cca4a52dae034db099012df6de3faef2f3fb38184b05996402ac280b681959483824419b6deb4db1a
@bitcoin bitcoin locked and limited conversation to collaborators Mar 14, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.