Skip to content

Conversation

sr-gi
Copy link
Member

@sr-gi sr-gi commented Jul 25, 2023

Rationale

Currently, addnode has a couple of corner cases that allow it to either connect to the same peer more than once, hence wasting outbound connection slots, or add redundant information to m_added_nodes, hence making Bitcoin iterate through useless data on a regular basis.

Connecting to the same node more than once

In general, connecting to the same node more than once is something we should try to prevent. Currently, this is possible via addnode in two different ways:

  1. Calling addnode more than once in a short time period, using two equivalent but distinct addresses
  2. Calling addnode add using an IP, and addnode onetry after with an address that resolved to the same IP

For the former, the issue boils down to CConnman::ThreadOpenAddedConnections calling CConnman::GetAddedNodeInfo once, and iterating over the result to open connections (CConman::OpenNetworkConnection) on the same loop for all addresses.CConnman::ConnectNode only checks a single address, at random, when resolving from a hostname, and uses it to check whether we are already connected to it.

An example to test this would be calling:

bitcoin-cli addnode "127.0.0.1:port" add
bitcoin-cli addnode "localhost:port" add

And check how it allows us to perform both connections some times, and some times it fails.

The latter boils down to the same issue, but takes advantage of onetry bypassing the CConnman::ThreadOpenAddedConnections logic and calling CConnman::OpenNetworkConnection straightaway. A way to test this would be:

bitcoin-cli addnode "127.0.0.1:port" add
bitcoin-cli addnode "localhost:port" onetry

Adding the same peer with two different, yet equivalent, addresses

The current implementation of addnode is pretty naive when checking what data is added to m_added_nodes. Given the collection stores strings, the checks at CConnman::AddNode() basically check wether the exact provided string is already in the collection. If so, the data is rejected, otherwise, it is accepted. However, ips can be formatted in several ways that would bypass those checks.

Two examples would be 127.0.0.1 being equal to 127.1 and [::1] being equal to [0:0:0:0:0:0:0:1]. Adding any pair of these will be allowed by the rpc command, and both will be reported as connected by getaddednodeinfo, given they map to the same CService.

This is less severe than the previous issue, since even tough both nodes are reported as connected by getaddednodeinfo, there is only a single connection to them (as properly reported by getpeerinfo). However, this adds redundant data to m_added_nodes, which is undesirable.

Parametrize CConnman::GetAddedNodeInfo

Finally, this PR also parametrizes CConnman::GetAddedNodeInfo so it returns either all added nodes info, or only info about the nodes we are not connected to. This method is used both for rpc, in getaddednodeinfo, in which we are reporting all data to the user, so the former applies, and to check what nodes we are not connected to, in CConnman::ThreadOpenAddedConnections, in which we are currently returning more data than needed and then actively filtering using CService.fConnected()

@DrahtBot
Copy link
Contributor

DrahtBot commented Jul 25, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK jonatack, kashifs, mzumsande
Stale ACK vasild

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #28248 (p2p: peer connection bug fixes by jonatack)
  • #26812 (test: add end-to-end tests for CConnman and PeerManager by vasild)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@sr-gi
Copy link
Member Author

sr-gi commented Jul 25, 2023

I've purposely split this into three commits according to each of the fixes so they can be discussed, and considered independently. That way it'll be easier to remove any if the change does not reach enough consensus.

@sr-gi sr-gi force-pushed the 202307-addednodeinfo branch from ad3fee2 to 2110ac1 Compare July 25, 2023 21:03
@sr-gi sr-gi force-pushed the 202307-addednodeinfo branch from 2110ac1 to 866bc36 Compare July 26, 2023 15:24
@kevkevinpal
Copy link
Contributor

would it make sense to add a functional test like the below under

ip_port = "localhost:[]".format(p2p_port='add')
self.nodes[0].addnode(node=ip_port, command='add')

in test/functional/rpc_net.py on line 212
because we shortly after assert that there is only 1 added node added

@sr-gi
Copy link
Member Author

sr-gi commented Jul 28, 2023

would it make sense to add a functional test like the below under

ip_port = "localhost:[]".format(p2p_port='add')
self.nodes[0].addnode(node=ip_port, command='add')

in test/functional/rpc_net.py on line 212 because we shortly after assert that there is only 1 added node added

I don't think that'd work. You'll still be able to add two nodes that resolve to the same IP using addnode as long as they belong to different namespaces. Lookups are not performed. The difference though is that both connections won't be established. However, the connection is not being checked by the test.

What could be added, however, is a check for 127.0.0.1 and 127.1.

PS: This actually made me realize that the check for whether to add something to m_added_nodes was not good enough.

@sr-gi sr-gi force-pushed the 202307-addednodeinfo branch 2 times, most recently from 6304950 to b5763bc Compare July 28, 2023 17:23
@sr-gi
Copy link
Member Author

sr-gi commented Jul 28, 2023

Added a test to test/functional/rpc_net.py and fixed the check on CConnman::AddNode

@jonatack
Copy link
Member

jonatack commented Aug 12, 2023

Concept ACK. I've been looking at this code lately as well, e.g. bug fixes (5ae9fe1 / c9dbf0b), logic (4820f7f) and logging (d6b0da5) improvements and agree with the issues you report. Will review soon.

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approach ACK a6fcf17

@sr-gi sr-gi force-pushed the 202307-addednodeinfo branch from a6fcf17 to d0e7e18 Compare August 23, 2023 08:39
@sr-gi
Copy link
Member Author

sr-gi commented Aug 23, 2023

Addressed @vasild comments

@DrahtBot DrahtBot mentioned this pull request Aug 24, 2023
Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK d0e7e18

Would be nice to print the hostname here: #28155 (comment)

@sr-gi sr-gi force-pushed the 202307-addednodeinfo branch from d0e7e18 to fd8b1db Compare August 25, 2023 13:47
Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK fd8b1db

for initial partial unit test coverage of these CConnman class methods:

- AddNode()
- ConnectNode()
- GetAddedNodeInfo()
- AlreadyConnectedToAddress()
- ThreadOpenAddedConnections()

and of the GetAddedNodeInfo() call in RPC addnode.
@sr-gi sr-gi force-pushed the 202307-addednodeinfo branch from fc1e9e4 to 0420f99 Compare October 31, 2023 17:37
@jonatack
Copy link
Member

re-ACK 0420f99

@kashifs
Copy link
Contributor

kashifs commented Nov 2, 2023

tACK 0420f9

Copy link
Contributor

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested ACK 0420f99

@sr-gi
Copy link
Member Author

sr-gi commented Nov 3, 2023

tACK 0420f9

Thanks for reviewing! Would you mind sharing what did you test and how?

@vasild would you mind re-acking this when you have some time?

@kashifs
Copy link
Contributor

kashifs commented Nov 4, 2023

tACK 0420f9

Thanks for reviewing! Would you mind sharing what did you test and how?

@vasild would you mind re-acking this when you have some time?

I pulled the branch, compiled from source on my Mac, and ran:

./src/test/test_bitcoin -t net_peer_connection_tests -l message -- -printtoconsole=1

and

./src/test/test_bitcoin -t net_peer_connection_tests -l all -- -printtoconsole=1

I also pored over every line of modified code in order to better understand the design.

Is there anything else that I should do that might be helpful?

@sr-gi
Copy link
Member Author

sr-gi commented Nov 4, 2023

tACK 0420f9

Thanks for reviewing! Would you mind sharing what did you test and how?

@vasild would you mind re-acking this when you have some time?

I pulled the branch, compiled from source on my Mac, and ran:

./src/test/test_bitcoin -t net_peer_connection_tests -l message -- -printtoconsole=1

and

./src/test/test_bitcoin -t net_peer_connection_tests -l all -- -printtoconsole=1

I also pored over every line of modified code in order to better understand the design.

Is there anything else that I should do that might be helpful?

Nope, that's good. Thanks again for reviewing and testing.

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 0420f99

Comment on lines +105 to +114
CNode* ConnmanTestMsg::ConnectNodePublic(PeerManager& peerman, const char* pszDest, ConnectionType conn_type)
{
CNode* node = ConnectNode(CAddress{}, pszDest, /*fCountFailure=*/false, conn_type, /*use_v2transport=*/true);
if (!node) return nullptr;
node->SetCommonVersion(PROTOCOL_VERSION);
peerman.InitializeNode(*node, ServiceFlags(NODE_NETWORK | NODE_WITNESS));
node->fSuccessfullyConnected = true;
AddTestNode(*node);
return node;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, naming: maybe a different name would have been better than ConnectNodePublic(), given that it contains extra logic on top of the private ConnectNode().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. I'm touching that code for #28248, if you have a naming suggestion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dunno, maybe ConnectNodeAndInitialize()?

achow101 added a commit that referenced this pull request Apr 25, 2024
… connecting to a node

fd81a37 net: attempts to connect to all resolved addresses when connecting to a node (Sergi Delgado Segura)

Pull request description:

  This is a follow-up of #28155 motivated by #28155 (comment)

  ## Rationale

  Prior to this, when establishing a network connection via `CConnman::ConnectNode`, if the connection needed address resolution, a single address would be picked at random from the resolved addresses and our node would try to connect to it. However, this would lead to the behavior of `ConnectNode` being unpredictable when the address was resolved to various ips (e.g. the address resolving to IPv4 and IPv6, but we only support one of them).

  This patches the aforementioned behavior by going over all resolved IPs until a valid one is found or until we
  exhaust them.

ACKs for top commit:
  mzumsande:
    re-ACK fd81a37 (just looked at diff, only small logging change)
  achow101:
    ACK fd81a37
  vasild:
    ACK fd81a37

Tree-SHA512: fa1ebc5c84fe61dd0a7fe1113ae2d594a75ad661c43ed8984a31fc9bc50f166b2759b0d8d84ee5dc247691eff78c8156fac970af797bbcbf67492eec0353fb58
luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request May 13, 2024
… one

The current `addnode` rpc command has some edge cases in where it is possible to
connect to the same node twice by combining ip and address requests. This can happen under two situations:

The two commands are run one right after each other, in which case they will be processed
under the same loop in `CConnman::ThreadOpenAddedConnections` without refreshing `vInfo`, so both
will go trough. An example of this would be:

```
bitcoin-cli addnode "localhost:port" add

```

A node is added by IP using `addnode "add"` while the other is added by name using
`addnode "onetry"` with an address that resolves to multiple IPs. In this case, we currently
only check one of the resolved IPs (picked at random), instead of all the resolved ones, meaning
this will only probabilistically fail/succeed. An example of this would be:

```
bitcoin-cli addnode "127.0.0.1:port" add
[...]
bitcoin-cli addnode "localhost:port" onetry
```

Both cases can be fixed by iterating over all resolved addresses in `CConnman::ConnectNode` instead
of picking one at random

Github-Pull: bitcoin#28155
Rebased-From: 2574b7e
luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request May 13, 2024
…ently

Currently it is possible to add the same node twice when formatting IPs in
different, yet equivalent, manner. This applies to both ipv4 and ipv6, e.g:

127.0.0.1 = 127.1 | [::1] = [0:0:0:0:0:0:0:1]

`addnode` will accept both and display both as connected (given they translate to
the same IP). This will not result in multiple connections to the same node, but
will report redundant info when querying `getaddednodeinfo` and populate `m_added_nodes`
with redundant data.

This can be avoided performing comparing the contents of `m_added_addr` and the address
to be added as `CServices` instead of as strings.

Github-Pull: bitcoin#28155
Rebased-From: 94e8882
luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request May 13, 2024
…nected address on request

`CConnman::GetAddedNodeInfo` is used both to get a list of addresses to manually connect to
in `CConnman::ThreadOpenAddedConnections`, and to report about manually added connections in
`getaddednodeinfo`. In both cases, all addresses added to `m_added_nodes` are returned, however
the nodes we are already connected to are only relevant to the latter, in the former they are
actively discarded.

Parametrizes `CConnman::GetAddedNodeInfo` so we can ask for only addresses we are not connected to,
to avoid passing useless information around.

Github-Pull: bitcoin#28155
Rebased-From: 34b9ef4
luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request May 13, 2024
that are otherwise private:
- CConnman::m_nodes
- CConnman::ConnectNodes()
- CConnman::AlreadyConnectedToAddress()

and update the #include headers per iwyu.

Github-Pull: bitcoin#28155
Rebased-From: 4b834f6
luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request May 13, 2024
for initial partial unit test coverage of these CConnman class methods:

- AddNode()
- ConnectNode()
- GetAddedNodeInfo()
- AlreadyConnectedToAddress()
- ThreadOpenAddedConnections()

and of the GetAddedNodeInfo() call in RPC addnode.

Github-Pull: bitcoin#28155
Rebased-From: 0420f99
kwvg added a commit to kwvg/dash that referenced this pull request Oct 26, 2024
kwvg added a commit to kwvg/dash that referenced this pull request Oct 26, 2024
kwvg added a commit to kwvg/dash that referenced this pull request Oct 26, 2024
kwvg added a commit to kwvg/dash that referenced this pull request Oct 26, 2024
PastaPastaPasta added a commit to dashpay/dash that referenced this pull request Oct 27, 2024
, bitcoin#27213, bitcoin#28189, bitcoin#28155, bitcoin#28895, partial bitcoin#26396 (networking backports: part 9)

09504bd merge bitcoin#28895: do not make automatic outbound connections to addnode peers (Kittywhiskers Van Gogh)
6cf206c merge bitcoin#28155: improves addnode / m_added_nodes logic (Kittywhiskers Van Gogh)
11d654a merge bitcoin#28189: diversify network outbounds release note (Kittywhiskers Van Gogh)
5dc52b3 merge bitcoin#27213: Diversify automatic outbound connections with respect to networks (Kittywhiskers Van Gogh)
291305b merge bitcoin#26497: Make ConsumeNetAddr always produce valid onion addresses (Kittywhiskers Van Gogh)
6a37934 partial bitcoin#26396: Avoid SetTxRelay for feeler connections (Kittywhiskers Van Gogh)
221a78e merge bitcoin#25156: Introduce PeerManagerImpl::RejectIncomingTxs (Kittywhiskers Van Gogh)
cc694c2 merge bitcoin#22778: Reduce resource usage for inbound block-relay-only connections (Kittywhiskers Van Gogh)
6e6de54 net: use `Peer::m_can_relay_tx` when we mean `m_tx_relay != nullptr` (Kittywhiskers Van Gogh)
77526d2 net: rename `Peer::m_block_relay_only` to `Peer::m_can_tx_relay` (Kittywhiskers Van Gogh)

Pull request description:

  ## Additional Information

  * Dependency for #6333
  * When backporting [bitcoin#22778](bitcoin#22778), the `m_tx_inventory_known_filter.insert()` call in  `queueAndMaybePushInv()` had to be moved out as `EXCLUSIVE_LOCKS_REQUIRED` does not seem to work with lambda parameters list (though it does work with capture list, [source](https://github.com/dashpay/dash/blob/4b5e39290c8f1c78305e597312408c84e2b06b64/src/net_processing.cpp#L5781)), see error below

    <details>

    <summary>Compiler error:</summary>

    ```
    net_processing.cpp:5895:21: note: found near match 'tx_relay->m_tx_inventory_mutex'
    net_processing.cpp:5955:21: error: calling function 'operator()' requires holding mutex 'queueAndMaybePushInv.m_tx_inventory_mutex' exclusively [-Werror,-Wthread-safety-precise]
                        queueAndMaybePushInv(tx_relay, CInv(nInvType, hash));
                        ^
    net_processing.cpp:5955:21: note: found near match 'tx_relay->m_tx_inventory_mutex'
    net_processing.cpp:5977:17: error: calling function 'operator()' requires holding mutex 'queueAndMaybePushInv.m_tx_inventory_mutex' exclusively [-Werror,-Wthread-safety-precise]
                    queueAndMaybePushInv(inv_relay, inv);
                    ^
    ```

    </details>

    * Attempting to remove the `EXCLUSIVE_LOCKS_REQUIRED` or the `AssertLockHeld` are not options due to stricter thread sanitization checks being applied since [dash#6319](#6319) (and in general, removing annotations being inadvisable regardless)

    * We cannot simply lock `peer->GetInvRelay()->m_tx_inventory_mutex` as a) the caller already has a copy of the relay pointer (which means that `m_tx_relay_mutex` was already held) that b) they used to lock `m_tx_inventory_mutex` (which we were already asserting) so c) we cannot simply re-lock `m_tx_inventory_mutex` as it's already already held, just through a less circuitous path.

      * The reason locking is mentioned instead of asserting is that the compiler treats `peer->GetInvRelay()->m_tx_inventory_mutex` and `tx_relay->m_tx_inventory_mutex` as separate locks (despite being different ways of accessing the same thing) and would complain similarly to the error above if attempting to assert the former while holding the latter.

  * As `m_tx_relay` is always initialized for Dash, to mimic the behaviour _expected_ upstream, in [bitcoin#22778](bitcoin#22778), `SetTxRelay()` will _allow_ `GetTxRelay()` to return `m_can_tx_relay` (while Dash code that demands unconditional access can use `GetInvRelay()` instead) with the lack of `SetTxRelay()` resulting in `GetTxRelay()` feigning the absence of `m_can_tx_relay`.

    This allows us to retain the same style of checks used upstream instead of using proxies like `!CNode::IsBlockOnlyConn()` to determined if we _should_ use `m_tx_relay`.

  * Speaking of proxies, being a block-only connection is now no longer the only reason not to relay transactions

    In preparation for [bitcoin#26396](bitcoin#26396), proxy usage of `!CNode::IsBlockOnlyConn()` and `!Peer::m_block_relay_only` was replaced with the more explicit `Peer::m_can_tx_relay` before being used to gate the result of `GetTxRelay()`, to help it mimic upstream semantics.

  ## Breaking Changes

  None expected

  ## Checklist

  - [x] I have performed a self-review of my own code
  - [x] I have commented my code, particularly in hard-to-understand areas **(note: N/A)**
  - [x] I have added or updated relevant unit/integration/functional/e2e tests
  - [x] I have made corresponding changes to the documentation
  - [x] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_

ACKs for top commit:
  UdjinM6:
    utACK 09504bd

Tree-SHA512: f4f36f12f749b697dd4ad5521ed15f862c93ed4492a047759554aa80a3ce00dbd1bdc0242f7a4468f41f25925d5b79c8ab774d8489317437b1983f0a1277eecb
@bitcoin bitcoin locked and limited conversation to collaborators Nov 14, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants