Skip to content

Conversation

vasild
Copy link
Contributor

@vasild vasild commented Aug 18, 2020

For each address to be relayed we "randomly" pick 2 nodes to send the
address to (in RelayAddress()). However we do not take into
consideration that it does not make sense to relay the address back to
its originator (CNode::PushAddress() will do nothing in that case).

This means that if the originator is among the "randomly" picked nodes,
then we will relay to one node less than intended.

Fix this by skipping the originating node when choosing candidates to
relay to.

@vasild
Copy link
Contributor Author

vasild commented Aug 18, 2020

I stumbled into this while testing #19728.

@sipa
Copy link
Member

sipa commented Aug 18, 2020

This behavior is intentional, and the hash-based logic to determine where to send exploits it (there is a comment about m_addr_known in RelayAddress).

The idea is that every incoming addr is only relayed to a fixed subset of 1 or 2 peers during windows of 24 hours. This means that if the same address is relayed again, it will go to the same peers (and likely not be sent at all as they already know it). This change would permit someone to unjustly give their address better propagation by sending it repeatedly.

@vasild vasild marked this pull request as draft August 18, 2020 20:16
@vasild
Copy link
Contributor Author

vasild commented Aug 18, 2020

Alright, I see the problem this would cause.

The condition added in this PR !pnode->m_addr_known->contains(addr) is too strong. I only intended to avoid relaying to the node that sent us the address, trying to avoid this adverse scenario:

  • let node0 have connections to 8 other nodes: node1, ..., node8.
  • node5 sends some address to node0
  • node0 is about to relay it to 2 "random" nodes and picks node3 and node5
  • node5 already knows the address and so is skipped by PushAddress()
  • the address is relayed to just node3 - one node instead of two 💣.

Looks like the chance of this happening is 1/8, or 12.5% 1/4, or 25%.

This means that if the same address is relayed again, it will go to the same peers (and likely not be sent at all as they already know it)

Is this "likely" actually "for sure"? Is this correct: "This means that the same address will not be re-relayed again in 24h because we would attempt to send it to the same peers and this will be a noop by PushAddress()"? (assuming the list of peers does not change).

@sipa
Copy link
Member

sipa commented Aug 18, 2020

@vasild Ah, I see. That's a good point, and should probably be improved. I think we can just make sure that the originating node is skipped in the list of considered candidates?

@DrahtBot DrahtBot added the P2P label Aug 18, 2020
@vasild vasild force-pushed the only_relay_to_unaware branch from 0313eb7 to e80a80b Compare August 19, 2020 10:38
@vasild vasild changed the title net: don't try to relay to node that knows an addr net: don't try to relay to the address' originator Aug 19, 2020
@vasild vasild marked this pull request as ready for review August 19, 2020 10:40
@vasild
Copy link
Contributor Author

vasild commented Aug 19, 2020

make sure that the originating node is skipped in the list of considered candidates

Exactly! Done in e80a80b and extended a test so that it fails without this fix.

Reopened for review.

@jonatack
Copy link
Member

Interesting. Will review.

@sipa
Copy link
Member

sipa commented Aug 19, 2020

Concept ACK

@dergoegge
Copy link
Member

What if the same address is relayed again but from a different peer? I think the list of nodes the address would be relayed to might not be the same, because now it depends on the address AND the origin and not just the address, so the list would only be deterministic for the same origin.

@vasild
Copy link
Contributor Author

vasild commented Aug 20, 2020

@dergoegge, the hashKey of each relay-to candidate does not depend on the originator. Can you elaborate? I think the list of nodes the address would be relayed to does not depend on the originator, except that the originator will be skipped if he happens to be one of the two chosen nodes and in his place another node will be selected.

Notice that CSipHasher(hasher).Write(pnode->GetId()).Finalize() creates a temporary which is destroyed before the next candidate is evaluated.

@dergoegge
Copy link
Member

@vasild while trying to come up with an example i realised that you are right and that i didn't fully understand how sortfunc works (even though it has "sort" in the name 😅), sorry.

Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK e80a80b

A few comments below, feel free to pick and choose.

const CAddress& addr,
/** [in] Whether the address' network is reachable. We relay unreachable addresses less. */
bool fReachable,
/** [in] Connection manager to choose nodes to relay to. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the Doxygen style in developer-notes.md might be easier to read.

Line 1485: s/do/does/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. This is inconsistent with the style used in the rest of the project. If you want to update the project style for function comments, please open a PR to do that or raise in an irc meeting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The guidelines already say

Use Doxygen-compatible comment blocks for functions

and the above is (was) doxygen-compatible, so I don't think there is anything to update in developer-notes.md.

The used inline comments have numerous advantages over @param which I will summarize and try to get some feedback to eventually get some explicit "allow that in new code".

Anyway, no need to block this PR with that, so I have changed it to use @param.

def on_addr(self, message):
for addr in message.addrs:
assert_equal(addr.nServices, 9)
assert_greater_than(8343, addr.port)
assert_greater_than_or_equal(addr.port, 8333)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few suggestions for the test code

@@ -18,8 +18,6 @@ from test_framework.mininode import (
 from test_framework.test_framework import BitcoinTestFramework
 from test_framework.util import (
     assert_equal,
-    assert_greater_than,
-    assert_greater_than_or_equal,
 )
 import time
 
@@ -42,8 +40,8 @@ class AddrReceiver(P2PInterface):
     def on_addr(self, message):
         for addr in message.addrs:
             assert_equal(addr.nServices, 9)
-            assert_greater_than(8343, addr.port)
-            assert_greater_than_or_equal(addr.port, 8333)
+            if not 8333 <= addr.port < 8343:
+                raise AssertionError("Invalid addr.port of {} (8333-8342 expected)".format(addr.port))
             assert addr.ip.startswith('123.123.123.')
             self.num_ipv4_received += 1
 
@@ -59,14 +57,14 @@ class AddrTest(BitcoinTestFramework):
         msg = msg_addr()
 
         self.log.info('Send too-large addr message')
-        msg.addrs = ADDRS * 101 # more than 1000 addresses in one message
+        msg.addrs = ADDRS * 101  # more than 1000 addresses in one message
         with self.nodes[0].assert_debug_log(['addr message size = 1010']):
             addr_source.send_and_ping(msg)
 
         self.log.info('Check that addr message content is relayed and added to addrman')
         num_receivers = 7
         receivers = []
-        for i in range(num_receivers):
+        for _ in range(num_receivers):
             receivers.append(self.nodes[0].add_p2p_connection(AddrReceiver()))
         msg.addrs = ADDRS
-        with self.nodes[0].assert_debug_log([
-                'Added {} addresses from 127.0.0.1: 0 tried'.format(num_ipv4_addrs),
-                'received: addr (301 bytes) peer=0',
-        ]):
+        with self.nodes[0].assert_debug_log(
+            [
+                "Added {} addresses from 127.0.0.1: 0 tried".format(num_ipv4_addrs),
+                "received: addr (301 bytes) peer=0",
+            ]
+        ):
             addr_source.send_and_ping(msg)

# Every IPv4 address must be relayed to two peers, other than the
# originating node (addr_source).
ipv4_branching_factor = 2
assert_equal(total_ipv4_received, num_ipv4_addrs * ipv4_branching_factor)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified this assertion fails on master with relay to fewer nodes:

AssertionError: not(17 == 20)
AssertionError: not(17 == 20)
AssertionError: not(19 == 20)

@DrahtBot
Copy link
Contributor

DrahtBot commented Aug 20, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@naumenkogs
Copy link
Member

Concept ACK.

Copy link
Contributor

@promag promag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK, code change in net_processing LGTM.

@naumenkogs
Copy link
Member

Code review ACK e80a80b
Needs rebase.

@vasild vasild force-pushed the only_relay_to_unaware branch from e80a80b to bc45012 Compare September 2, 2020 08:46
@vasild
Copy link
Contributor Author

vasild commented Sep 2, 2020

Rebased to resolve conflicts.

@vasild vasild force-pushed the only_relay_to_unaware branch from bc45012 to fd897f8 Compare September 2, 2020 08:47
@vasild
Copy link
Contributor Author

vasild commented Sep 2, 2020

Picked some suggestions.

@vasild
Copy link
Contributor Author

vasild commented Oct 13, 2020

I gave some more thought on this.

We pick 2 random nodes to relay to regardless of whether those nodes know the address or not. If we happen to pick a node that knows the address we will not relay to it and so we will relay to less than the intended 2 nodes.

The originator of the address knows the address and that is just a special case of the above.

So, assuming address propagation works, maybe this is not an issue. If it is to be addressed then maybe something like the following should be devised:

  • if this is the first time we are relaying a given address within the current 24h window, then pick 2 random destinations from the ones that don't know it
  • if we have already relayed the given address within the current 24h window, then don't relay it

@vasild vasild closed this Oct 13, 2020
@naumenkogs
Copy link
Member

Not sure why this was closed. I think the issue is worth addressing, or at least kept in mind...

So, assuming address propagation works, maybe this is not an issue.

I think it is an issue because it is unexpected behavior. We might think that we always relay to 2 (at least best-effort), but this issue makes it not true, and it's trivial to fix.

I'm wondering what you didn't like about your solution?

@vasild
Copy link
Contributor Author

vasild commented Oct 19, 2020

I closed it because it is fixing just a special case of a more widespread issue. Otherwise I don't think there is a technical issue with the patch. It will be an improvement, thus reopening.

Let me illustrate with an example:

  • let n0 have connections to 8 other nodes: n1, ..., n8
  • n1, n2 and n3 already know some address addr (according to n0)
  • n5 sends addr to n0
  • n5 is immediately marked as knowing addr
  • now nodes n1, n2, n3 and n5 know addr (according to n0)
  • n0 is about to relay it to 2 "random" nodes and picks:
    • in master it would pick among all of n1, ..., n8 and if it happens to choose a node that knows the address, then it will skip relaying to it
    • in this PR it will pick among n1, n2, n3, n4, n6, n7, n8 - an improvement but it could still end up skipping relay if it chooses one of the first 3
    • best way: pick among the nodes that don't know addr: n4, n6, n7, n8. But notice this fix is not trivial - if we implement it naively and relay to e.g. n4 and n6 we will mark them as knowing the address and if after 1 hour we are about to relay the same address again, then we would choose from just n7 and n8. We want to keep our choice constant within 24h.

@vasild vasild reopened this Oct 19, 2020
@naumenkogs
Copy link
Member

@vasild I agree with the description. Maybe once someone implements the "best way" this can be closed, but for now I think it's better keep this as a reference (and maybe merge this as an improvement with 0 overhead).

@sdaftuar
Copy link
Member

sdaftuar commented Dec 3, 2020

utACK (other than the test code, which I did not review). I think this is an improvement even if addr relay can be further improved.

I should add -- I think on balance we should be more concerned about ensuring that honest actors have their addresses relay well, more than we should be concerned about adversaries getting additional relay (I think @naumenkogs made a similar point above). My intuition right now is that it's easy for someone circumventing our addr relay system to get their own addresses to propagate very well, while honest peers spinning up take a long time to get their addresses gossiped on the network. This needs more research though.

@sipa
Copy link
Member

sipa commented Dec 3, 2020

Maybe once someone implements the "best way"

When Erlay for addr messages?

@jonatack
Copy link
Member

jonatack commented Dec 8, 2020

To summarize, if I'm non overlooking anything, this PR has ACKs by @naumenkogs, @sdaftuar, and myself, a fourth ACK (pre-rebase) by @promag, and a concept ACK by @hebasto.

Copy link
Contributor

@jnewbery jnewbery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK. This seems to be an incremental improvement. I can't think of any downside to doing this.

const CAddress& addr,
/** [in] Whether the address' network is reachable. We relay unreachable addresses less. */
bool fReachable,
/** [in] Connection manager to choose nodes to relay to. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. This is inconsistent with the style used in the rest of the project. If you want to update the project style for function comments, please open a PR to do that or raise in an irc meeting.

For each address to be relayed we "randomly" pick 2 nodes to send the
address to (in `RelayAddress()`). However we do not take into
consideration that it does not make sense to relay the address back to
its originator (`CNode::PushAddress()` will do nothing in that case).

This means that if the originator is among the "randomly" picked nodes,
then we will relay to one node less than intended.

Fix this by skipping the originating node when choosing candidates to
relay to.
@vasild vasild force-pushed the only_relay_to_unaware branch from 0b8dd95 to 7fabe0f Compare December 10, 2020 13:42
@vasild
Copy link
Contributor Author

vasild commented Dec 10, 2020

Looks like github just lost my comment. @jnewbery did you get it in email?

@vasild vasild force-pushed the only_relay_to_unaware branch from 7fabe0f to e80a80b Compare December 10, 2020 14:24
@vasild
Copy link
Contributor Author

vasild commented Dec 10, 2020

resurrected the comment by rewinding the PR to an old commit (e80a80b)

I agree. This is inconsistent with the style used in the rest of the project. If you want to update the project style for function comments, please open a PR to do that or raise in an irc meeting.

The guidelines already say

Use Doxygen-compatible comment blocks for functions

and the above is (was) doxygen-compatible, so I don't think there is anything to update in developer-notes.md.

The used inline comments have numerous advantages over @param which I will summarize and try to get some feedback to eventually get some explicit "allow that in new code".

Anyway, no need to block this PR with that, so I have changed it to use @param.

@vasild vasild force-pushed the only_relay_to_unaware branch from e80a80b to 7fabe0f Compare December 10, 2020 14:27
@vasild
Copy link
Contributor Author

vasild commented Dec 10, 2020

Changes since the ACKs - replace inline doxygen comments with @param ones:

https://github.com/bitcoin/bitcoin/compare/0b8dd9555c57e64212656a3e801502ce9878bd15..7fabe0f359ae16ed36ce4ca2c33631d038c21448

@sdaftuar
Copy link
Member

ACK 7fabe0f (this time I looked at the test, and verified the test breaks in expected ways if I break the code).

@jonatack
Copy link
Member

Re-reviewing as soon as I finish 19858.

@sdaftuar
Copy link
Member

Not sure what is going on with the failed appveyor or cirrus ci builds; can someone with a better understanding of how those work take a look?

@jnewbery
Copy link
Contributor

utACK 7fabe0f (only net_processing changes. I haven't reviewed the test changes)

@maflcko
Copy link
Member

maflcko commented Dec 10, 2020

(ci can be ignored. Looks like a ci integration problem)

Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK 7fabe0f per git range-diff b76abae fd897f8 7fabe0f, change since last review is rebase and more readable Doxygen documentation

If you retouch, it might be more readable as

src/net_processing.cpp
- * @param[in] originator The peer that sent us the address. We don't want to relay it back.
- * @param[in] addr Address to relay.
- * @param[in] fReachable Whether the address' network is reachable. We relay unreachable
- * addresses less.
- * @param[in] connman Connection manager to choose nodes to relay to.
+ *
+ * @param[in] originator  The peer that sent us the address. We don't want to relay it back.
+ * @param[in] addr        Address to relay.
+ * @param[in] connman     Connection manager to choose nodes to relay to.
+ * @param[in] fReachable  Whether the address' network is reachable. We relay unreachable addresses less.
  */
-static void RelayAddress(const CNode& originator,
-                         const CAddress& addr,
-                         bool fReachable,
-                         const CConnman& connman)
+static void RelayAddress(const CNode& originator, const CAddress& addr, const CConnman& connman, bool fReachable)
 {
     unsigned int nRelayNodes = fReachable ? 2 : 1; // limited relaying of addresses outside our network(s)
 
@@ -2653,7 +2650,7 @@ void PeerLogicValidation::ProcessMessage(CNode& pfrom, const std::string& msg_ty
             if (addr.nTime > nSince && !pfrom.fGetAddr && vAddr.size() <= 10 && addr.IsRoutable())
             {
                 // Relay to a limited number of other nodes
-                RelayAddress(pfrom, addr, fReachable, m_connman);
+                RelayAddress(pfrom, addr, m_connman, fReachable);
             }

@vasild
Copy link
Contributor Author

vasild commented Dec 14, 2020

(ci can be ignored. Looks like a ci integration problem)

Ok, I am leaving it as is then. I can rebase or push a minor whitespace change to trigger a fresh CI run (invalidade ACKs).

@maflcko maflcko merged commit b103fdc into bitcoin:master Dec 14, 2020
@vasild vasild deleted the only_relay_to_unaware branch December 14, 2020 13:06
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Dec 14, 2020
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Jan 21, 2022
Summary:
```
For each address to be relayed we "randomly" pick 2 nodes to send the address to (in `RelayAddress()`). However we do not take into consideration that it does not make sense to relay the address back to its originator (`CNode::PushAddress()` will do nothing in that case).

This means that if the originator is among the "randomly" picked nodes, then we will relay to one node less than intended.

Fix this by skipping the originating node when choosing candidates to relay to.
```

Backport of [[bitcoin/bitcoin#19763 | core#19763]].

Depends on D10859.

Ref T1696.

Test Plan:
  ninja all check-all

Reviewers: #bitcoin_abc, PiRK

Reviewed By: #bitcoin_abc, PiRK

Maniphest Tasks: T1696

Differential Revision: https://reviews.bitcoinabc.org/D10860
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.