Conversation

sipa
Member

@sipa sipa commented Sep 29, 2021

The original CAddrMan behaviour (before #5941) was to pick a uniformly random non-empty bucket, and then pick a random element from that bucket. That commit, which introduced deterministic placement of entries in buckets, changed this to picking a uniformly random non-empty bucket position instead.

I believe that was a mistake. Buckets are our best metric for spreading out requests across independently-controlled nodes. That does mean that if a bucket has fewer entries, its entries are supposed to be picked more frequently.

This PR reverts to the original high-level behavior, but on top of the deterministic placement logic.
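For illustration, here is a minimal C++ sketch of the two strategies being compared. The table layout, names and constants are simplified stand-ins rather than addrman's actual code, the GetChance()/fChanceFactor acceptance step is omitted, and both functions assume the table holds at least one entry.

```
#include <random>
#include <vector>

constexpr int BUCKET_COUNT = 256; // e.g. the tried table
constexpr int BUCKET_SIZE = 64;
using Table = std::vector<std::vector<int>>; // -1 marks an empty slot

// Behaviour since #5941: keep drawing (bucket, position) pairs until a filled
// slot is hit, so every occupied position is equally likely.
int SelectPositionFirst(const Table& table, std::mt19937_64& rng)
{
    std::uniform_int_distribution<int> bucket{0, BUCKET_COUNT - 1};
    std::uniform_int_distribution<int> pos{0, BUCKET_SIZE - 1};
    while (true) {
        int b = bucket(rng), p = pos(rng);
        if (table[b][p] != -1) return table[b][p];
    }
}

// Behaviour proposed here: land on a bucket first, scan it (wrapping around)
// for an entry, and only restart if the whole bucket is empty, so every
// non-empty bucket is equally likely regardless of how full it is.
int SelectBucketFirst(const Table& table, std::mt19937_64& rng)
{
    std::uniform_int_distribution<int> bucket{0, BUCKET_COUNT - 1};
    std::uniform_int_distribution<int> pos{0, BUCKET_SIZE - 1};
    while (true) {
        int b = bucket(rng), p = pos(rng);
        for (int i = 0; i < BUCKET_SIZE; ++i) {
            int q = (p + i) % BUCKET_SIZE;
            if (table[b][q] != -1) return table[b][q];
        }
        // Bucket entirely empty: try a (likely) different one.
    }
}
```

As a worked example of the difference: with one bucket holding a single entry and another holding 64, the bucket-first rule picks each of the two buckets equally often, so the lone entry is returned about half the time while each entry of the full bucket is returned about 1/128 of the time; the position-first rule would give every entry the same 1/65 chance.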

@sipa sipa force-pushed the 202109_addrmanbias branch from fd82d9d to 8fd67d8 on September 29, 2021 20:30
@DrahtBot DrahtBot added the P2P label Sep 29, 2021
@amitiuttarwar
Contributor

concept ACK

@naumenkogs
Member

Okay so here's what I think is happening:

Before this PR, we pick bucket+position randomly. If that's empty, we pick bucket+position randomly again.
After this PR, we pick bucket+position randomly. If that's empty, we stay in the same bucket and go from randomly_picked_position to the end of the bucket. If we haven't found anything, we pick bucket+position randomly again.

That does mean that if a bucket has fewer entries, its entries are supposed to be picked more frequently.

I think this is a good intuitive high-level justification of the proposed change.

So I don't think this can be directly exploited, at least not easily (because the placement is randomized by hashing), but it should just make connectivity better overall for a given node (and perhaps for the network as a whole)?
Since a bucket is a means of diversification, making them all equal makes sense.

But also, after this change, ending up in a less-populated bucket is "luckier" for that node. That probably evens out across all nodes in the network, so that should be fine.

Concept ACK.

Contributor

@rajarshimaitra rajarshimaitra left a comment


Concept ACK.

Below is a conceptual question for my own understanding.

@sipa sipa force-pushed the 202109_addrmanbias branch from 8fd67d8 to 0f47d86 on September 30, 2021 17:23
src/addrman.cpp Outdated
const auto it_found{mapInfo.find(nId)};
assert(it_found != mapInfo.end());
const CAddrInfo& info{it_found->second};
if (insecure_rand.randbits(30) < fChanceFactor * info.GetChance() * (1 << 30)) return info;
Member


nit: while you're touching this, we could also make this more readable by avoiding bit operations :)

Member Author


Any suggestions?

Member

@naumenkogs naumenkogs Oct 4, 2021


I meant insecure_rand.randrange(XYZ) < fChanceFactor * info.GetChance() * XYZ and defining XYZ should just work?
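If it helps, here is a hedged sketch of that suggestion. kChanceScale is a hypothetical name standing in for the XYZ placeholder, and insecure_rand, fChanceFactor and info are the variables already in scope in Select_:

```
// Hypothetical constant standing in for XYZ; 2^30 keeps the same resolution
// as the current randbits(30) comparison.
constexpr uint64_t kChanceScale{uint64_t{1} << 30};

// Accept the entry with probability fChanceFactor * info.GetChance(),
// written without an explicit bit operation at the call site.
if (insecure_rand.randrange(kChanceScale) < fChanceFactor * info.GetChance() * kChanceScale) return info;
```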

@naumenkogs
Member

ACK 0f47d86

I think the code is correct; I haven't run any experiments to verify the probabilities.

@sipa sipa force-pushed the 202109_addrmanbias branch from 0f47d86 to 318a8b2 on October 1, 2021 15:21
@jnewbery
Contributor

jnewbery commented Oct 1, 2021

When we learn of a new address from a source peer, I believe we essentially:

  • select 64 of the 1024 new buckets based on the source peer's netgroup
  • then select the specific bucket out of those 64 based on the destination's netgroup
  • then select the bucket position (out of 64) based on the destination's network address (and the bucket index)

That means that if we have one or more source peers that are filling up our new table with addresses, each one will be able to fill up to 64 new buckets with 64 addresses each (4096 addresses).
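To make the three-step derivation above concrete, here is a rough sketch. Hash64 is only a stand-in for addrman's keyed hashing (the real code also mixes in the secret nKey and the netgroup/asmap machinery), so treat the details as illustrative rather than the actual implementation:

```
#include <cstdint>
#include <functional>
#include <string>

// Stand-in for addrman's keyed hash; std::hash merely keeps the sketch self-contained.
uint64_t Hash64(const std::string& data) { return std::hash<std::string>{}(data); }

constexpr int NEW_BUCKET_COUNT = 1024;
constexpr int NEW_BUCKETS_PER_SOURCE_GROUP = 64;
constexpr int BUCKET_SIZE = 64;

int NewBucketFor(const std::string& addr_group, const std::string& src_group)
{
    // Steps 1-2: the source's netgroup restricts us to 64 of the 1024 buckets,
    // and the destination's netgroup selects one of those 64.
    uint64_t which_of_64 = Hash64(src_group + '|' + addr_group) % NEW_BUCKETS_PER_SOURCE_GROUP;
    return Hash64(src_group + '|' + std::to_string(which_of_64)) % NEW_BUCKET_COUNT;
}

int BucketPositionFor(const std::string& addr, int bucket)
{
    // Step 3: the slot within the bucket depends on the address and the bucket index.
    return Hash64(addr + '|' + std::to_string(bucket)) % BUCKET_SIZE;
}
```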

This proposed change means that if a minority of source peers are sending a disproportionately high number of address records, the addresses that they're providing won't get disproportionately chosen by Select(). I think it also means that we're more likely than currently to try to connect to addresses that we learn about via self-broadcast gossip from inbound nodes. I expect that (initially at least) our new tables are dominated by addresses that we learn from DNS seeds or from getaddr responses, but that those records are concentrated into specific buckets. By selecting a bucket first, then a position, we're more likely to select from an under-filled bucket, which would have been populated by address gossip.

Those seem like good changes to me.

When we promote an address to the tried table, we:

  • select 8 of the 256 buckets based on the destination's netgroup
  • then select the specific bucket based on the destination's network address
  • then select the bucket position based on the destination's network address (and the bucket index)

That means that if we successfully connect to a peer in an under-represented netgroup, it'll be placed in a more sparsely-populated tried bucket than if we successfully connect to a peer in a more common netgroup. With this change, we're more likely to choose that address again when Select() is called. If this change is applied to nodes across the network, then nodes on less well represented netgroups will receive slightly more inbound requests on average than those on common netgroups.
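Under the same assumptions, and reusing the Hash64 stand-in from the sketch above, the tried-table derivation would look roughly like this (again illustrative, not the real code):

```
constexpr int TRIED_BUCKET_COUNT = 256;
constexpr int TRIED_BUCKETS_PER_GROUP = 8;

int TriedBucketFor(const std::string& addr, const std::string& addr_group)
{
    // The destination's netgroup limits us to 8 of the 256 buckets, the address
    // itself picks one of those 8, and BucketPositionFor then fixes the slot.
    uint64_t which_of_8 = Hash64(addr) % TRIED_BUCKETS_PER_GROUP;
    return Hash64(addr_group + '|' + std::to_string(which_of_8)) % TRIED_BUCKET_COUNT;
}
```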

So, changing the way that Select() picks entries from the new and tried tables does two slightly different things:

  1. for the new tables, it means that each address source netgroup is (closer to) equally likely to be picked. That's an improvement because we'll no longer favour addresses from spammy address sources.
  2. for the tried tables, it means that each destination netgroup is (closer to) equally likely to be picked, but this is only for addresses that we've connected to at some point in the past.

As far as I can understand, the justification for the PR describes (2), but I think (1) is far more significant, since there are always more entries in the new tables, and because for feeler connections we only select from the new tables.

@DrahtBot
Contributor

DrahtBot commented Oct 1, 2021

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

No conflicts as of last run.

@naumenkogs
Member

ACK 318a8b2


I was mainly thinking about new-table entries, where the benefit is clear because the bucket is selected based on the source.

W.r.t. tried, I'm not even sure this makes any practical sense. We won't connect to the same netgroup anyway, right?
So, "destination netgroup is (closer to) equally likely to be picked" won't make any real difference (there will be some difference, but I think it doesn't matter).

The original CAddrMan behaviour (before commit
e6b343d) was to pick a uniformly
random non-empty bucket, and then pick a random element from that
bucket. That commit, which introduced deterministic placement
of entries in buckets, changed this to picking a uniformly random
non-empty bucket position instead.

This commit reverts to the original high-level behavior, but in the
deterministic placement model.
@sipa sipa force-pushed the 202109_addrmanbias branch from 318a8b2 to 632aad9 on October 5, 2021 15:48
@sipa
Member Author

sipa commented Oct 5, 2021

Rebased.

@jnewbery
Contributor

jnewbery commented Oct 5, 2021

W.r.t. tried, I'm not even sure this makes any practical sense. We won't connect to the same netgroup anyway, right?
So, "destination netgroup is (closer to) equally likely to be picked" won't make any real difference (there will be some difference, but I think it doesn't matter).

@naumenkogs - I agree. The logic in ThreadOpenConnections prevents us from making multiple outbound connections to peers in the same network group.

@sipa - what do you think about removing the change to picking a tried entry, and only having this logic for picking a new entry?

@sipa
Member Author

sipa commented Oct 5, 2021

@jnewbery I prefer being consistent, unless there is actually a reason why this is worse for the tried table?

}
// If the bucket is entirely empty, start over with a (likely) different one.
if (i == ADDRMAN_BUCKET_SIZE) continue;
int nId = vvTried[nKBucket][nKBucketPos];
Member


Is this an infinite loop if all buckets are empty? It's unclear to me if that can be the case here.

Member Author


It would be, if the function didn't start by testing whether the relevant table is empty.

Member


Ah, ok, that relationship was unclear from the local code alone, but makes sense after a quick refresher of how vRandom works.

Member Author


Right, it's a bit obscure.

@jnewbery
Contributor

jnewbery commented Oct 6, 2021

utACK 632aad9

@jnewbery I prefer being consistent, unless there is actually a reason why this is worse for the tried table?

No, I don't think it would be any worse for the tried table. I can see the benefit of being consistent for the two tables.

@naumenkogs
Member

ACK 632aad9

Contributor

@mzumsande mzumsande left a comment


Concept ACK, one question about the bucket iteration logic below.

W.r.t. tried, I'm not even sure this makes any practical sense. We won't connect to the same netgroup anyway, right?
So, "destination netgroup is (closer to) equally likely to be picked" won't make any real difference (there will be some difference, but I think it doesn't matter).

The small difference might be that before, there was a tendency to select more tried items from the same netgroup, leading to retries in ThreadOpenConnections if the netgroup was already used for an outbound, and therefore leading to a slight bias towards selecting entries from new vs tried (because the next Select() call would again choose from new with p=0.5). But I agree that this is not material, and even if it were, the new behavior makes more sense.

// Iterate over the positions of that bucket, starting at the initial one,
// and looping around.
int i;
for (i = 0; i < ADDRMAN_BUCKET_SIZE; ++i) {
Contributor


I think that this way of iterating does not select all items in a bucket with the same probability when the bucket is partially filled. In the most extreme case of just two items in adjacent positions, one item would get picked with probability 63/64, the other one with probability 1/64 because we start at a random position and then move from left to right until we find something. Do you think that this is a problem? Was the older version of this PR with nUBucketPos ^ i maybe better because of this?
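For what it's worth, a small self-contained simulation of that two-adjacent-entries case (illustrative only, not addrman code) reproduces the 63/64 vs 1/64 split:

```
#include <array>
#include <cstdio>
#include <random>

int main()
{
    constexpr int BUCKET_SIZE = 64;
    std::array<int, BUCKET_SIZE> bucket{}; // 0 marks an empty slot
    bucket[10] = 1;                        // two occupied, adjacent positions
    bucket[11] = 2;

    std::mt19937_64 rng{42};
    std::uniform_int_distribution<int> start_dist{0, BUCKET_SIZE - 1};

    constexpr int TRIALS = 1000000;
    int hits[3] = {0, 0, 0};
    for (int t = 0; t < TRIALS; ++t) {
        int start = start_dist(rng);
        // Scan forward from a random starting position, wrapping around,
        // and take the first occupied slot.
        for (int i = 0; i < BUCKET_SIZE; ++i) {
            int pos = (start + i) % BUCKET_SIZE;
            if (bucket[pos] != 0) { ++hits[bucket[pos]]; break; }
        }
    }
    // Expect roughly 0.984 (63/64) and 0.016 (1/64).
    std::printf("entry at 10: %.3f  entry at 11: %.3f\n",
                hits[1] / double(TRIALS), hits[2] / double(TRIALS));
}
```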

Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're both equally non-uniform in this regard. I can try adding something to make it actually uniform if there is a concern, but given that the contents of the buckets are secret and assumed not to be under attacker control, I don't think it matters very much.

Contributor


Yes, I agree that this cannot be abused. Just a small quirk that in a given addrman configuration, the selection is not always perfectly uniform.

@mzumsande
Contributor

ACK 632aad9

@laanwj laanwj merged commit 58275db into bitcoin:master Oct 22, 2021
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Oct 22, 2021
…ions, first

632aad9 Make CAddrman::Select_ select buckets, not positions, first (Pieter Wuille)

Pull request description:

  The original CAddrMan behaviour (before bitcoin#5941) was to pick a uniformly random non-empty bucket, and then pick a random element from that bucket. That commit, which introduced deterministic placement of entries in buckets, changed this to picking a uniformly random non-empty bucket position instead.

  I believe that was a mistake. Buckets are our best metric for spreading out requests across independently-controlled nodes. That
  does mean that if a bucket has fewer entries, its entries are supposed to be picked more frequently.

  This PR reverts to the original high-level behavior, but on top of the deterministic placement logic.

ACKs for top commit:
  jnewbery:
    utACK 632aad9
  naumenkogs:
    ACK 632aad9
  mzumsande:
    ACK 632aad9

Tree-SHA512: 60768afba2b6f0abd0dddff04381cab5acf374df48fc0e883ee16dde7cf7fd33056a04b573cff24a1b4d8e2a645bf0f0b3689eec84da4ff330e7b59ef142eca1
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Oct 21, 2022
Summary:
```
The original CAddrMan behaviour (before commit
e6b343d) was to pick a uniformly random non-empty bucket, and then pick a random element from that bucket. That commit, which introduced deterministic placement of entries in buckets, changed this to picking a uniformly random non-empty bucket position instead.

This commit reverts to the original high-level behavior, but in the deterministic placement model.
```

Backport of [[bitcoin/bitcoin#23140 | core#23140]].

Depends on D12339.

Test Plan:
  ninja all check-all

Reviewers: #bitcoin_abc, sdulfari

Reviewed By: #bitcoin_abc, sdulfari

Differential Revision: https://reviews.bitcoinabc.org/D12340
@bitcoin bitcoin locked and limited conversation to collaborators Oct 30, 2022