-
Notifications
You must be signed in to change notification settings - Fork 37.7k
net: don't try to relay to the address' originator #19763
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I stumbled into this while testing #19728. |
This behavior is intentional, and the hash-based logic to determine where to send exploits it (there is a comment about The idea is that every incoming addr is only relayed to a fixed subset of 1 or 2 peers during windows of 24 hours. This means that if the same address is relayed again, it will go to the same peers (and likely not be sent at all as they already know it). This change would permit someone to unjustly give their address better propagation by sending it repeatedly. |
Alright, I see the problem this would cause. The condition added in this PR
Looks like the chance of this happening is
Is this "likely" actually "for sure"? Is this correct: "This means that the same address will not be re-relayed again in 24h because we would attempt to send it to the same peers and this will be a noop by |
@vasild Ah, I see. That's a good point, and should probably be improved. I think we can just make sure that the originating node is skipped in the list of considered candidates? |
0313eb7
to
e80a80b
Compare
Exactly! Done in e80a80b and extended a test so that it fails without this fix. Reopened for review. |
Interesting. Will review. |
Concept ACK |
What if the same address is relayed again but from a different peer? I think the list of nodes the address would be relayed to might not be the same, because now it depends on the address AND the origin and not just the address, so the list would only be deterministic for the same origin. |
@dergoegge, the hashKey of each relay-to candidate does not depend on the originator. Can you elaborate? I think the list of nodes the address would be relayed to does not depend on the originator, except that the originator will be skipped if he happens to be one of the two chosen nodes and in his place another node will be selected. Notice that |
@vasild while trying to come up with an example i realised that you are right and that i didn't fully understand how |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK e80a80b
A few comments below, feel free to pick and choose.
src/net_processing.cpp
Outdated
const CAddress& addr, | ||
/** [in] Whether the address' network is reachable. We relay unreachable addresses less. */ | ||
bool fReachable, | ||
/** [in] Connection manager to choose nodes to relay to. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the Doxygen style in developer-notes.md might be easier to read.
Line 1485: s/do/does/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. This is inconsistent with the style used in the rest of the project. If you want to update the project style for function comments, please open a PR to do that or raise in an irc meeting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The guidelines already say
Use Doxygen-compatible comment blocks for functions
and the above is (was) doxygen-compatible, so I don't think there is anything to update in developer-notes.md
.
The used inline comments have numerous advantages over @param
which I will summarize and try to get some feedback to eventually get some explicit "allow that in new code".
Anyway, no need to block this PR with that, so I have changed it to use @param
.
test/functional/p2p_addr_relay.py
Outdated
def on_addr(self, message): | ||
for addr in message.addrs: | ||
assert_equal(addr.nServices, 9) | ||
assert_greater_than(8343, addr.port) | ||
assert_greater_than_or_equal(addr.port, 8333) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a few suggestions for the test code
@@ -18,8 +18,6 @@ from test_framework.mininode import (
from test_framework.test_framework import BitcoinTestFramework
from test_framework.util import (
assert_equal,
- assert_greater_than,
- assert_greater_than_or_equal,
)
import time
@@ -42,8 +40,8 @@ class AddrReceiver(P2PInterface):
def on_addr(self, message):
for addr in message.addrs:
assert_equal(addr.nServices, 9)
- assert_greater_than(8343, addr.port)
- assert_greater_than_or_equal(addr.port, 8333)
+ if not 8333 <= addr.port < 8343:
+ raise AssertionError("Invalid addr.port of {} (8333-8342 expected)".format(addr.port))
assert addr.ip.startswith('123.123.123.')
self.num_ipv4_received += 1
@@ -59,14 +57,14 @@ class AddrTest(BitcoinTestFramework):
msg = msg_addr()
self.log.info('Send too-large addr message')
- msg.addrs = ADDRS * 101 # more than 1000 addresses in one message
+ msg.addrs = ADDRS * 101 # more than 1000 addresses in one message
with self.nodes[0].assert_debug_log(['addr message size = 1010']):
addr_source.send_and_ping(msg)
self.log.info('Check that addr message content is relayed and added to addrman')
num_receivers = 7
receivers = []
- for i in range(num_receivers):
+ for _ in range(num_receivers):
receivers.append(self.nodes[0].add_p2p_connection(AddrReceiver()))
msg.addrs = ADDRS
- with self.nodes[0].assert_debug_log([
- 'Added {} addresses from 127.0.0.1: 0 tried'.format(num_ipv4_addrs),
- 'received: addr (301 bytes) peer=0',
- ]):
+ with self.nodes[0].assert_debug_log(
+ [
+ "Added {} addresses from 127.0.0.1: 0 tried".format(num_ipv4_addrs),
+ "received: addr (301 bytes) peer=0",
+ ]
+ ):
addr_source.send_and_ping(msg)
# Every IPv4 address must be relayed to two peers, other than the | ||
# originating node (addr_source). | ||
ipv4_branching_factor = 2 | ||
assert_equal(total_ipv4_received, num_ipv4_addrs * ipv4_branching_factor) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Verified this assertion fails on master with relay to fewer nodes:
AssertionError: not(17 == 20)
AssertionError: not(17 == 20)
AssertionError: not(19 == 20)
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Concept ACK. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK, code change in net_processing LGTM.
Code review ACK e80a80b |
e80a80b
to
bc45012
Compare
Rebased to resolve conflicts. |
bc45012
to
fd897f8
Compare
Picked some suggestions. |
I gave some more thought on this. We pick 2 random nodes to relay to regardless of whether those nodes know the address or not. If we happen to pick a node that knows the address we will not relay to it and so we will relay to less than the intended 2 nodes. The originator of the address knows the address and that is just a special case of the above. So, assuming address propagation works, maybe this is not an issue. If it is to be addressed then maybe something like the following should be devised:
|
Not sure why this was closed. I think the issue is worth addressing, or at least kept in mind...
I think it is an issue because it is unexpected behavior. We might think that we always relay to 2 (at least best-effort), but this issue makes it not true, and it's trivial to fix. I'm wondering what you didn't like about your solution? |
I closed it because it is fixing just a special case of a more widespread issue. Otherwise I don't think there is a technical issue with the patch. It will be an improvement, thus reopening. Let me illustrate with an example:
|
@vasild I agree with the description. Maybe once someone implements the "best way" this can be closed, but for now I think it's better keep this as a reference (and maybe merge this as an improvement with 0 overhead). |
utACK (other than the test code, which I did not review). I think this is an improvement even if addr relay can be further improved. I should add -- I think on balance we should be more concerned about ensuring that honest actors have their addresses relay well, more than we should be concerned about adversaries getting additional relay (I think @naumenkogs made a similar point above). My intuition right now is that it's easy for someone circumventing our addr relay system to get their own addresses to propagate very well, while honest peers spinning up take a long time to get their addresses gossiped on the network. This needs more research though. |
When Erlay for addr messages? |
To summarize, if I'm non overlooking anything, this PR has ACKs by @naumenkogs, @sdaftuar, and myself, a fourth ACK (pre-rebase) by @promag, and a concept ACK by @hebasto. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK. This seems to be an incremental improvement. I can't think of any downside to doing this.
src/net_processing.cpp
Outdated
const CAddress& addr, | ||
/** [in] Whether the address' network is reachable. We relay unreachable addresses less. */ | ||
bool fReachable, | ||
/** [in] Connection manager to choose nodes to relay to. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. This is inconsistent with the style used in the rest of the project. If you want to update the project style for function comments, please open a PR to do that or raise in an irc meeting.
For each address to be relayed we "randomly" pick 2 nodes to send the address to (in `RelayAddress()`). However we do not take into consideration that it does not make sense to relay the address back to its originator (`CNode::PushAddress()` will do nothing in that case). This means that if the originator is among the "randomly" picked nodes, then we will relay to one node less than intended. Fix this by skipping the originating node when choosing candidates to relay to.
0b8dd95
to
7fabe0f
Compare
Looks like github just lost my comment. @jnewbery did you get it in email? |
7fabe0f
to
e80a80b
Compare
resurrected the comment by rewinding the PR to an old commit (e80a80b)
The guidelines already say
and the above is (was) doxygen-compatible, so I don't think there is anything to update in The used inline comments have numerous advantages over Anyway, no need to block this PR with that, so I have changed it to use |
e80a80b
to
7fabe0f
Compare
Changes since the ACKs - replace inline doxygen comments with |
ACK 7fabe0f (this time I looked at the test, and verified the test breaks in expected ways if I break the code). |
Re-reviewing as soon as I finish 19858. |
Not sure what is going on with the failed appveyor or cirrus ci builds; can someone with a better understanding of how those work take a look? |
utACK 7fabe0f (only net_processing changes. I haven't reviewed the test changes) |
(ci can be ignored. Looks like a ci integration problem) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-ACK 7fabe0f per git range-diff b76abae fd897f8 7fabe0f
, change since last review is rebase and more readable Doxygen documentation
If you retouch, it might be more readable as
src/net_processing.cpp
- * @param[in] originator The peer that sent us the address. We don't want to relay it back.
- * @param[in] addr Address to relay.
- * @param[in] fReachable Whether the address' network is reachable. We relay unreachable
- * addresses less.
- * @param[in] connman Connection manager to choose nodes to relay to.
+ *
+ * @param[in] originator The peer that sent us the address. We don't want to relay it back.
+ * @param[in] addr Address to relay.
+ * @param[in] connman Connection manager to choose nodes to relay to.
+ * @param[in] fReachable Whether the address' network is reachable. We relay unreachable addresses less.
*/
-static void RelayAddress(const CNode& originator,
- const CAddress& addr,
- bool fReachable,
- const CConnman& connman)
+static void RelayAddress(const CNode& originator, const CAddress& addr, const CConnman& connman, bool fReachable)
{
unsigned int nRelayNodes = fReachable ? 2 : 1; // limited relaying of addresses outside our network(s)
@@ -2653,7 +2650,7 @@ void PeerLogicValidation::ProcessMessage(CNode& pfrom, const std::string& msg_ty
if (addr.nTime > nSince && !pfrom.fGetAddr && vAddr.size() <= 10 && addr.IsRoutable())
{
// Relay to a limited number of other nodes
- RelayAddress(pfrom, addr, fReachable, m_connman);
+ RelayAddress(pfrom, addr, m_connman, fReachable);
}
Ok, I am leaving it as is then. I can rebase or push a minor whitespace change to trigger a fresh CI run (invalidade ACKs). |
Summary: ``` For each address to be relayed we "randomly" pick 2 nodes to send the address to (in `RelayAddress()`). However we do not take into consideration that it does not make sense to relay the address back to its originator (`CNode::PushAddress()` will do nothing in that case). This means that if the originator is among the "randomly" picked nodes, then we will relay to one node less than intended. Fix this by skipping the originating node when choosing candidates to relay to. ``` Backport of [[bitcoin/bitcoin#19763 | core#19763]]. Depends on D10859. Ref T1696. Test Plan: ninja all check-all Reviewers: #bitcoin_abc, PiRK Reviewed By: #bitcoin_abc, PiRK Maniphest Tasks: T1696 Differential Revision: https://reviews.bitcoinabc.org/D10860
For each address to be relayed we "randomly" pick 2 nodes to send the
address to (in
RelayAddress()
). However we do not take intoconsideration that it does not make sense to relay the address back to
its originator (
CNode::PushAddress()
will do nothing in that case).This means that if the originator is among the "randomly" picked nodes,
then we will relay to one node less than intended.
Fix this by skipping the originating node when choosing candidates to
relay to.