p2p: Don't self-advertise during version processing #26199

mzumsande · 2022-09-28T21:43:54Z

This picks up the last commit from #19843.

Previously, we would prepare to self-announce to a new peer while parsing a version message from that peer.
This is redundant, because we do something very similar in MaybeSendAddr(), which is called from SendMessages() after
the version handshake is finished.

There are a couple of differences:

MaybeSendAddr() self-advertises to all peers we do address relay with, not just outbound ones.
GetLocalAddrForPeer() called from MaybeSendAddr() makes a probabilistic decision to either advertise what they think we are or what we think we are, while PushAddress() on version deterministically only does the former if the address from the latter is unroutable.
During version processing, we haven't received a potential sendaddrv2 message from our peer yet, so self-advertisements with addresses from addrV2-only networks would always be dropped in PushAddress().

Since it's confusing to have two slightly different mechanisms for self-advertising, and the one in MaybeSendAddr() is better, remove the one in version.

Previously, we would prepare to self-announce to a new peer while parsing a VERSION message from that peer. This is redundant, because we do something very similar in MaybeSendAddr(), which is called from SendMessages() after the version handshake is finished. There are a couple of differences: 1) MaybeSendAddr() self-advertises to all peers we do address relay with, not just outbound ones. 2) GetLocalAddrForPeer() called from MaybeSendAddr() makes a probabilistic decision to either advertise what they think we are or what we think we are, while PushAddress(self) on VERSION deterministically only does the former if the address from the latter is unroutable. 3) During VERSION processing, we haven't received a potential sendaddrv2 message from our peer yet, so self-advertisements with addresses from addrV2-only networks would always be dropped in PushAddress(). Since it's confusing to have two slightly different mechanisms for self-advertising, and the one in MaybeSendAddr() is better, remove the one in VERSION. Co-authored-by: Martin Zumsande <mzumsande@gmail.com>

mzumsande · 2022-09-28T21:53:51Z

FYI @naumenkogs @dergoegge

sipa · 2022-09-28T22:05:31Z

Concept ACK

DrahtBot · 2022-09-28T23:18:18Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	amitiuttarwar, stratospher, naumenkogs
Concept ACK	sipa, dergoegge, glozow

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#25619 (net: avoid overriding non-virtual ToString() in CService and use better naming by vasild)
#24008 (assumeutxo: net_processing changes by jamesob)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

naumenkogs · 2022-09-29T07:43:43Z

src/net_processing.cpp

@@ -3274,39 +3274,16 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
            m_num_preferred_download_peers += state->fPreferredDownload;
        }

-        // Self advertisement & GETADDR logic
        if (!pfrom.IsInboundConn() && SetupAddressRelay(pfrom, *peer)) {


SetupAddressRelay used to have a misleading comment: "First addr message we have received from the peer, initialize m_addr_known". We were not receiving any addr messages from the peer here, yet we initialized it. It seems like this was the case from the beginning.

After this PR, calling SetupAddressRelay here seems unnecessary. If we drop it, the comment becomes valid :)
Although the inconsistency is a good signal the comment is confusing, so perhaps we should drop it. Not very useful anyway, at least the current code is pretty readable.

So yeah: 1) Drop SetupAddressRelay here; 2) Possibly drop the useless comment from SetupAddressRelay?

Alright, this has side-effects. MaybeSendAddr (another self-ad) will stop working until we get an ADDR message from the peer (not the case before the change i suggest here).

Perhaps this code should be:

if (!pfrom.IsInboundConn()) { SetupAddressRelay(pfrom, *peer); .... }

I think the current code is correct, SetupAddressRelay here has three tasks:

Initialize m_addr_known for outbound peers (unless block-relay-only), see [p2p] Reduce addr blackholes #21528

Prevent Self-Advertisements to block-relay-only outbound peers (for which SetupAddressRelay returns false)

Prevent a GETADDR to block-relay-only outbound peers

Since 1) and 3) are still valid after this PR, I think we both need to keep calling SetupAddressRelay and skip if it returns false.
I think this should be documented better (I had to revisit 21528 to recall how it worked while preparing this PR), so I will add some more documentation for this soon.

The refactoring I suggested in my last message here is indeed wrong, because it ignores the returned value.

Do you agree that the comment inside SetupAddressRelay is wrong though? That's the only thing that remains worrying me.

Do you agree that the comment inside SetupAddressRelay is wrong though?

Yeah, it's incomplete because it only applies to inbound peers. I added documentation there and before the call to SetupAddressRelay during version processing (as a second commit, because it's not directly related to the changes).

nit: "Prevent GETADDR from being sent" sounds very complicated, and also might have additional unnecessary meaning of pro-actively preventing something.

I'd say "Don't send GETADDR [...]" is better. Feel free to ignore.

missed this comment, changed wording now.

agreed that the comment in SetupAddressRelay was misleading. thanks for catching / fixing. I didn't hunt down the history, but I bet that was my fault 😅

RE current comment about "Don't send GETADDR...":
I find it a bit confusing, because we're also not sending GETADDR message to inbound peers. maybe this code is too dense and is better communicated in a bit more verbose manner?

eg. in non-syntax-adhering code:

bool send_getaddr = false; if (!pfrom.IsInboundConn()) { send_getaddr = SetupAddressRelay(); } if (send_getaddr){ // do GETADDR stuff }

also cool if you want to skip the refactor and just want to update the comment to something like..

// Initialize address relay for all outbound peers // Send GETADDR to non-block-relay-only outbound peers

I wish we had a better way to refer to non-block-relay-only outbound peers 😬

I agree that it this would be better to understand if it's a bit more spread out, so I took your suggestion and changed the comment - hopefully it's clear now?

dergoegge · 2022-09-30T08:47:31Z

Concept ACK

naumenkogs · 2022-09-30T09:39:26Z

I dunno if I'm supposed to ACK this since this is my commit, but I approve the changes 3c43d9d I guess :)

An easy way to look at this change is to notice that this code is almost entirely a subset of MaybeSendAddr. At the same time, it heavily relies on MaybSendAddr to actually do something, not much would change if you drop the duplicate subset (what this PR does).

One real difference this could make is that due to the "makes a probabilistic decision to either advertise what they think we are or what we think we are" of this behavior, we could have ended up having two self-ads in one ADDR message (one added here, one added in MaybeSendAddr if they are different). I don't think this is a show-stopper for this PR. I don't think we want to preserve this behavior.

UPD:

Another important difference could be the time between the two calls. This might make sense if there's a risk of someone inserting our self-ad (PushAddress) with the wrong metadata, eg nTime. Then our latter self-ad insertion would be dismissed (because we just look up the key), while the VERSION-one is protected from this issue.
I don't think this is a show-stopper for this PR, although this surface could indeed be improved (e.g., handle duplicates better, or handle other-nodes-announce-us-through-us differently).

glozow

Concept ACK

mzumsande · 2022-09-30T18:26:03Z

One real difference this could make is that due to the "makes a probabilistic decision to either advertise what they think we are or what we think we are" of this behavior, we could have ended up having two self-ads in one ADDR message (one added here, one added in MaybeSendAddr if they are different). I don't think this is a show-stopper for this PR. I don't think we want to preserve this behavior.

Yes, advertising both addresses is something that can happen, but only in a few cases:
e.g. if the address during version message processing is AddrV2-only, it is not sent out because we haven't gotten sendaddrv2 yet. I think it might happen if we discovered a local IPv6 address and our peer sees us under a IPv4 address?
I think that if this is something we really want, we could always advertise both (similar to your commit 10fc62f) but I'm not sure it is necessary - after all, we have the nScore logic to make an educated guess which address is better.

Another important difference could be the time between the two calls. This might make sense if there's a risk of someone inserting our self-ad (PushAddress) with the wrong metadata, eg nTime. Then our latter self-ad insertion would be dismissed (because we just look up the key), while the VERSION-one is protected from this issue.
I don't think this is a show-stopper for this PR, although this surface could indeed be improved (e.g., handle duplicates better, or handle other-nodes-announce-us-through-us differently).

I think I don't understand this completely, but the time between advertising during version message and MaybeSendAddr should usually be negligible, because m_next_local_addr_send is initialized to 0, so MaybeSendAddr always advertises right after the version handshake is completed.
Also, we clear m_addr_known right before self-advertising in MaybeSendAddr (code) so it wouldn't be possible for an attacker to do something like preventing us from self-advertising by sending us our own address. Is that what you meant?

naumenkogs · 2022-10-03T08:35:09Z

ACK 24608c2

amitiuttarwar · 2022-11-22T02:37:20Z

code review ACK 1002cd1. this PR improves the encapsulation of self-advertisement logic, which simplifies the code as well as our mental models of addr relay on the network.

@naumenkogs

Another important difference could be the time between the two calls.

@mzumsande

I think I don't understand this completely, but the time between advertising during version message and MaybeSendAddr should usually be negligible, because m_next_local_addr_send is initialized to 0, so MaybeSendAddr always advertises right after the version handshake is completed.

Agreed, but I'd take it slightly further. I think there is no difference in when the advertisement would actually occur. the ProcessMessage -> version logic called PushAddress which (at best) adds the address to m_addrs_to_send. The actual constructing & sending of the message wouldn't occur until the SendMessages loop calls MaybeSendAddr.

My understanding of this PR is that there would be no tangible behavioral differences in the cases where the version logic returned the same address as MaybeSendAddr logic. on master, if they result in two different addr, I believe we would advertise both. after this PR, we would only advertise the result of GetLocalAddrForPeer, which will select one of the two.

Since the version logic always prioritized GetLocalAddress (aka what we think we are), the behavioral difference would occur in instances when GetLocalAddrForPeer (aka weighted guess) selects the result from IsPeerAddrLocalGood (aka what peer thinks we are).

I think the only time this would cause a connectivity issue is if your peers are somehow reporting your address to be something that can't be connected to, and you would have advertised something usable with GetLocalAddress. I expect this to be quite rare, but additionally your self-identified address would eventually get out when GetLocalAddrForPeer selects the result of GetLocalAddress.

I think that if this is something we really want, we could always advertise both (similar to your commit 10fc62f) but I'm not sure it is necessary - after all, we have the nScore logic to make an educated guess which address is better.

For the reasons I stated above, I agree. Advertising both seems unnecessary, and the worst case should just be slower time for inbounds to connect to you?

@mzumsande @naumenkogs curious to hear your feedback. do these claims track with your understanding?

amitiuttarwar · 2022-11-21T23:12:22Z

src/net_processing.cpp

@@ -3274,39 +3274,16 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
            m_num_preferred_download_peers += state->fPreferredDownload;
        }

-        // Self advertisement & GETADDR logic
        if (!pfrom.IsInboundConn() && SetupAddressRelay(pfrom, *peer)) {


agreed that the comment in SetupAddressRelay was misleading. thanks for catching / fixing. I didn't hunt down the history, but I bet that was my fault 😅

RE current comment about "Don't send GETADDR...":
I find it a bit confusing, because we're also not sending GETADDR message to inbound peers. maybe this code is too dense and is better communicated in a bit more verbose manner?

eg. in non-syntax-adhering code:

bool send_getaddr = false; if (!pfrom.IsInboundConn()) { send_getaddr = SetupAddressRelay(); } if (send_getaddr){ // do GETADDR stuff }

also cool if you want to skip the refactor and just want to update the comment to something like..

// Initialize address relay for all outbound peers // Send GETADDR to non-block-relay-only outbound peers

I wish we had a better way to refer to non-block-relay-only outbound peers 😬

src/net_processing.cpp

amitiuttarwar · 2022-11-22T17:08:09Z

oops, didn't post last night - just some suggestions around language in the comments. not essential to incorporate.

naumenkogs · 2022-11-23T07:49:06Z

Agreed, but I'd take it slightly further. I think there is no difference in when the advertisement would actually occur.

This was the initial purpose of the PR, yeah. In my reading of your comment, I'm not sure it "takes further" in that sense. My finding was more niche (again, probably very hypothetical) — if only the latter call is in place, the data could be hijacked. Trying to make sure we're on the same page :)

I agree with the rest of your comment.

This code was a bit hard to understand, so make it less dense and add more explanations. Doesn't change behavior. Co-authored-by: Amiti Uttarwar <amiti@uttarwar.org>

mzumsande · 2022-11-23T22:12:26Z

1002cd1 to 956c670: Applied suggestions by @amitiuttarwar to second commit - thanks!

do these claims track with your understanding?

yes, I agree!

the data could be hijacked.

I didn't understand this the first time, but I think I do now - you mean an attacker could continuously spam us with our own address from other connections (but with bad metadata), and thus prevent us from sending our self-advertisement (with correct metadata) to new peers because they'd inserted a duplicate in between our processing of the version messages and us getting to MaybeSendAddress().

I will have to think about this a bit more, but even if the timing issues and rate-limiting issues were solved, it would probably be really hard / impossible to that because of the way RelayAddress() relays a given address (no matter which peer it is received from) always to the same peers within 24 hours (which might change slightly with peers fluctuating, but not very much). It should be impossible for an attacker to make us relay a specific node in a targeted way to new nodes.

stratospher

Concept ACK. I went through the self advertisement logic and couldn't think of any concerning behavioral change this PR might induce. I do have a couple of questions though.

There were some entries in my node's debug.log where the peer would send incorrect entries(here, the peer's own address/they could be tampered with) for what address the peer thinks we are (vRecv >> addrMe).
- with the version processing mechanism, we'd self advertise our correct address at least once. (GetLocalAddress)
- with the MaybeSendAddr mechanism, we'd sometimes self advertise addrMe (what peer thinks we are/could be tampered with) only. (GetLocalAddrForPeer)
- The connections would last for only a short duration though(around 4 mins). Could this scenario be problematic?
even if we don't accept inbound connections, we still self-advertise. is it because we can't detect them?

mzumsande · 2022-11-28T16:58:28Z

There were some entries in my node's debug.log where the peer would send incorrect entries(here, the peer's own address/they could be tampered with) for what address the peer thinks we are (vRecv >> addrMe).

with the version processing mechanism, we'd self advertise our correct address at least once. (GetLocalAddress)

with the MaybeSendAddr mechanism, we'd sometimes self advertise addrMe (what peer thinks we are/could be tampered with) only. (GetLocalAddrForPeer)

The connections would last for only a short duration though(around 4 mins). Could this scenario be problematic?

I think that it's not possible to say in general which address (the local or peer's one) is better, or "correct". In some cases, the peer may have a better idea how we are reachable than we do. In other cases, it's the other way round, or maybe neither are working if we're just not reachable by others.

Also note that if we overrule our own local address and advertise with the address a peer sees us with, we'll only do that back to that same peer. We never send an addrMe address from one peer to other peers.
Therefore, I think that peers can only influence the weight (nScore) between multiple local addresses (all of which we detected by ourselves at startup!) via SeenLocal() but never insert new ones that they pick - this should make it impossible to trick us into advertising an incorrect address to other peers. If a peer is malicious, they can just discard the self-announcements we send them anyway, so it's not an attack if such a peer makes us advertise to them with a wrong address.

even if we don't accept inbound connections, we still self-advertise. is it because we can't detect them?

If we explicitly disable inbound connections (fListen == false), we don't self-advertise, see here.
But yes, it is a very common situation (e.g. if we're behind a NAT, or the port for bitcoin is blocked by our firewall) that bitcoind isn't able to detect that we're actually unreachable - so self-advertisements happen in this case.

amitiuttarwar

@naumenkogs @mzumsande
thanks for the feedback. I think I now better understand the hypothetical issue in terms of when we queue up the addresses internally, not when we actually trigger the message. due to the various mechanisms Martin mentioned, I think most viable attacks would be mitigated.

reACK 956c670

stratospher · 2022-12-01T16:45:58Z

ACK 956c670

naumenkogs · 2022-12-02T08:41:03Z

ACK 956c670

956c670 refactor, doc: Improve SetupAddressRelay call in version processing (Martin Zumsande) 3c43d9d p2p: Don't self-advertise during VERSION processing (Gleb Naumenko) Pull request description: This picks up the last commit from bitcoin#19843. Previously, we would prepare to self-announce to a new peer while parsing a `version` message from that peer. This is redundant, because we do something very similar in `MaybeSendAddr()`, which is called from `SendMessages()` after the version handshake is finished. There are a couple of differences: 1) `MaybeSendAddr()` self-advertises to all peers we do address relay with, not just outbound ones. 2) `GetLocalAddrForPeer()` called from `MaybeSendAddr()` makes a probabilistic decision to either advertise what they think we are or what we think we are, while `PushAddress()` on `version` deterministically only does the former if the address from the latter is unroutable. 3) During `version` processing, we haven't received a potential sendaddrv2 message from our peer yet, so self-advertisements with addresses from addrV2-only networks would always be dropped in `PushAddress()`. Since it's confusing to have two slightly different mechanisms for self-advertising, and the one in `MaybeSendAddr()` is better, remove the one in `version`. ACKs for top commit: stratospher: ACK 956c670 naumenkogs: ACK 956c670 amitiuttarwar: reACK 956c670 Tree-SHA512: 933d40615289f055c022170dde7bad0ac0a1d4be377538bfe9ba64375cfeb03bcd803901591f0739ac4850c880e8475a68fd1ab0330800030ab7f19e38c00274

DrahtBot added the P2P label Sep 28, 2022

naumenkogs reviewed Sep 29, 2022

View reviewed changes

glozow reviewed Sep 30, 2022

View reviewed changes

DrahtBot mentioned this pull request Oct 5, 2022

net: avoid overriding non-virtual ToString() in CService and use better naming #25619

Merged

mzumsande force-pushed the 202209_advertise_once branch from 24608c2 to 1002cd1 Compare October 13, 2022 21:30

This was referenced Nov 10, 2022

assumeutxo: net_processing changes #24008

Closed

assumeutxo #15606

Closed

amitiuttarwar reviewed Nov 22, 2022

View reviewed changes

refactor, doc: Improve SetupAddressRelay call in version processing

956c670

This code was a bit hard to understand, so make it less dense and add more explanations. Doesn't change behavior. Co-authored-by: Amiti Uttarwar <amiti@uttarwar.org>

mzumsande force-pushed the 202209_advertise_once branch from 1002cd1 to 956c670 Compare November 23, 2022 21:20

stratospher reviewed Nov 24, 2022

View reviewed changes

amitiuttarwar approved these changes Nov 30, 2022

View reviewed changes

fanquake requested a review from naumenkogs December 1, 2022 16:53

maflcko merged commit 6061eb6 into bitcoin:master Dec 12, 2022

mzumsande deleted the 202209_advertise_once branch December 12, 2022 16:30

bitcoin locked and limited conversation to collaborators Dec 12, 2023

p2p: Don't self-advertise during version processing #26199

p2p: Don't self-advertise during version processing #26199

Uh oh!

Conversation

mzumsande commented Sep 28, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mzumsande commented Sep 28, 2022

Uh oh!

sipa commented Sep 28, 2022

Uh oh!

DrahtBot commented Sep 28, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews

Conflicts

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mzumsande Sep 29, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

dergoegge commented Sep 30, 2022

Uh oh!

naumenkogs commented Sep 30, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

glozow left a comment

Choose a reason for hiding this comment

Uh oh!

mzumsande commented Sep 30, 2022

Uh oh!

naumenkogs commented Oct 3, 2022

Uh oh!

amitiuttarwar commented Nov 22, 2022

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

amitiuttarwar commented Nov 22, 2022

Uh oh!

naumenkogs commented Nov 23, 2022

Uh oh!

mzumsande commented Nov 23, 2022

Uh oh!

stratospher left a comment

Choose a reason for hiding this comment

Uh oh!

mzumsande commented Nov 28, 2022

Uh oh!

amitiuttarwar left a comment

Choose a reason for hiding this comment

Uh oh!

stratospher commented Dec 1, 2022

Uh oh!

naumenkogs commented Dec 2, 2022

Uh oh!

Uh oh!

mzumsande commented Sep 28, 2022 •

edited

Loading

DrahtBot commented Sep 28, 2022 •

edited

Loading

mzumsande Sep 29, 2022 •

edited

Loading

naumenkogs commented Sep 30, 2022 •

edited

Loading