
Conversation

naumenkogs (Member)

See issue #28371

Having no peers with sufficiently broad mempool limits may prevent our low-fee transactions from propagating, even though they would seem perfectly valid for us locally.

Having too few such peers may cause privacy leaks.

We should periodically check that we have sufficient peers with mempool limits comparable to ours.
If that's not true, we should evict some high-minFeeRate outbound peers to make room for a new peer with hopefully broader limits.
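
For concreteness, a minimal standalone sketch of the check being proposed. All names here (PeerFilter, SelectPeerToEvict) are invented for illustration, not the PR's actual code; only the overall shape follows the description above, and the threshold of 4 matches the policy discussed later in the thread:

// Illustrative sketch only; names and structure are not the PR's.
#include <algorithm>
#include <cstdint>
#include <vector>

struct PeerFilter {
    int64_t node_id;
    int64_t fee_filter; // peer's BIP133 feefilter
};

// Returns the id of the peer to evict, or -1 if enough peers already have
// fee filters no higher than our mempool min feerate.
int64_t SelectPeerToEvict(const std::vector<PeerFilter>& peers,
                          int64_t our_min_feerate, size_t min_similar_peers = 4)
{
    size_t similar = 0;
    for (const auto& p : peers) {
        if (p.fee_filter <= our_min_feerate) ++similar;
    }
    if (similar >= min_similar_peers || peers.empty()) return -1;
    // Evict the peer with the most restrictive (highest) fee filter,
    // hoping its replacement has broader limits.
    const auto worst = std::max_element(peers.begin(), peers.end(),
        [](const PeerFilter& a, const PeerFilter& b) { return a.fee_filter < b.fee_filter; });
    return worst->node_id;
}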

DrahtBot (Contributor) commented Sep 15, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK vasild

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #29415 (Broadcast own transactions only via short-lived Tor or I2P connections by vasild)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

DrahtBot added the P2P label Sep 15, 2023
darosior (Member) left a comment

I don't think it's as simple as disconnecting from peers with a minfee filter X% larger than ours. A minfee filter X% larger doesn't mean the peer's mempool is X% smaller. Take a simple extreme example: our mempool is filled with 299MB of >10 sat/vb transactions and 1MB of 1 sat/vb transactions. Our peer sets -maxmempool=299. It advertises a minfee filter >=10 sat/vb. That's at least 10 times ours, but we really shouldn't disconnect it.

What i was thinking about with my suggestion in #28371 was to get a conservative maximum size of the peer's mempool based on their advertised minfee filter and our own mempool, and disconnect it if it's unreasonably small (like 80% smaller than our own -maxmempool).
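
A rough standalone sketch of that bound: walk our own mempool and count only the bytes the peer's advertised filter would still admit, which gives a conservative maximum of how much of our mempool the peer can share. Types and names here are illustrative, not the PR's code:

#include <cstdint>
#include <vector>

struct MempoolTx {
    int64_t vsize;   // virtual size in vB
    int64_t feerate; // sat/vB
};

// Upper bound on how many bytes of *our* mempool a peer with the given
// minfee filter could be holding.
int64_t EstimatePeerMempoolBytes(const std::vector<MempoolTx>& our_mempool, int64_t peer_min_feerate)
{
    int64_t bytes = 0;
    for (const auto& tx : our_mempool) {
        if (tx.feerate >= peer_min_feerate) bytes += tx.vsize;
    }
    return bytes;
}

// Disconnect only if this bound is unreasonably small,
// e.g. below 20% of our own -maxmempool.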

naumenkogs (Member, Author)

@darosior m_total_fee / totalTxSize (instead of GetMinFee()) would kinda solve the example you point out, but it's not perfect either. I agree that what you suggest is ideal. I just haven't found a way to apply another peer's minfeerate to our mempool in a reasonable way.
Maybe this:

Every time we consider evicting (say every hour), traverse our mempool and apply the peer's minfeerate, then see how many transactions are dropped. (A slightly optimized way would be to use a binary search to find at what exact feerate we're at 80% of mempool size — that would be faster, and would also apply better across peers.)

Is that roughly what you're thinking about?
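
For illustration, a standalone sketch of the binary-search variant, assuming our mempool entries are pre-sorted by ascending feerate (names are invented, not the PR's code):

#include <cstdint>
#include <vector>

struct Entry {
    int64_t vsize;   // vB
    int64_t feerate; // sat/vB; entries sorted ascending by this field
};

// Highest feerate threshold that still keeps at least keep_fraction of our
// mempool bytes (e.g. keep_fraction = 0.8 for the 80% cut-off above).
int64_t FeerateKeepingFraction(const std::vector<Entry>& entries, double keep_fraction)
{
    if (entries.empty()) return 0;
    int64_t total = 0;
    for (const auto& e : entries) total += e.vsize;
    // suffix[i] = bytes kept if the threshold equals entries[i].feerate
    std::vector<int64_t> suffix(entries.size() + 1, 0);
    for (size_t i = entries.size(); i-- > 0;) suffix[i] = suffix[i + 1] + entries[i].vsize;
    // suffix is non-increasing, so binary search for the first index that
    // keeps too little; the answer is the index just before it.
    size_t lo = 0, hi = entries.size();
    while (lo < hi) {
        const size_t mid = lo + (hi - lo) / 2;
        if (suffix[mid] >= keep_fraction * total) lo = mid + 1; else hi = mid;
    }
    return lo == 0 ? 0 : entries[lo - 1].feerate;
}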

ariard commented Sep 18, 2023

If that's not true, we should evict some high-minFeeRate outbound peers to make room for a new peer with hopefully broader limits.

I think this is coming with the downside of dropping outbound peers, which might be otherwise stable and fast block-relay peers. If an adversary is able to discover our 8 full-relay peers and drop high-feerate conflicting transactions in their mempools, we’ll evict those peers, potentially opening new outbound slots to puppets controlled by the adversary.

I don’t know if a better direction would be to announce the local maxmempool at p2p connection; it sounds like this is less prone to manipulation by external peers feeding high-feerate transactions in a targeted fashion.

naumenkogs (Member, Author)

@ariard

If an adversary is able to discover our 8 full-relay peers and drop high-feerate conflicting transactions in their mempools, we’ll evict those peers, potentially opening new outbound slots to puppets controlled by the adversary.

I'm not convinced this is a problem yet. If you assume an attacker can easily take new outbound slots, then our peers might belong to an attacker in the first place... Outbound peer eviction should not be too cheap (say, an attacker should not be able to roll a hundred peers in a day), but for that I think the algorithm can be tuned.

I'm not 100% sure I understand your precise attack scenario. Let's say mempools are equal-size and half empty. What will it take for your attack to force at least one eviction? In fees and locked capital, I guess (dropping the spying effort for now).


I don’t know if a better direction would be to announce the local maxmempool at p2p connection, sounds this is less prone to manipulation by external peers flowing high-feerate transactions in a targeted fashion.

Doesn't it have a similar risk? Feed those peers with high-feerate conflicts and they won't be able to accept txs from my mempool, even though everything seems fine? They won't be dropped, but the relay won't work, at the same cost.

ariard commented Sep 19, 2023

Outbound peer eviction should not be too cheap (say, an attacker should not be able to roll a hundred peers in a day), but for that I think the algorithm can be tuned.

I think this is where, in my mind, this PR affects the reliability of our outbound peer topology: CheckForLimitedMempoolAndEvictPeers() will kick out high-feerate outbound peers every 60s, so at most 1440 outbound peers per day. Assuming we have 50k reachable nodes in the network, it would take about a month to browse the whole outbound peer space. If an adversary controls a minimal number of peers in this 50k set, it sounds like after a bit more than a month our whole outbound peer topology could be controlled. I agree the algorithm can be fine-tuned to slow down the worst-case peer roll-up.
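
The arithmetic behind those figures, for reference:

86,400 s/day ÷ 60 s per check = 1,440 possible evictions per day
50,000 reachable nodes ÷ 1,440 evictions/day ≈ 35 days to cycle through them all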

What will it take for your attack to force at least one eviction?

We assume 1 MB of max mempool size, both locally and for our outbound peers. The adversary broadcasts "set A", 500 parents at 1 sat/vb of size 100 KB (0.5 MB), in our outbound peers' mempools, and then simultaneously broadcasts "set B", 500 parents at 1 sat/vb of size 100 KB (0.5 MB), the 500 parents of A conflicting with the 500 parents of B. Then the adversary broadcasts a "set C" of 500 children at 3 sat/vb of size 100 KB (0.5 MB) in our outbound peers' mempools, and then simultaneously a "set D" of 500 children at 1 sat/vb of size 100 KB (0.5 MB).

When the adversary broadcasts one more transaction at 1 sat/vb, our outbound peers' mempool min fee should jump to 2 sat/vb (descendant score of set A txns) and our local mempool min fee should jump to 1 sat/vb (TrimToSize()), so our outbound peers will be disconnected due to high minFeeRate. The fee cost for the adversary is on the order of 0.02 BTC if my math is correct.
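
For reference, one way to arrive at the ~0.02 BTC figure, assuming each set totals 0.5 MB (500,000 vB) and the adversary ultimately pays the fees of the side held by our outbound peers (sets A and C):

set A: 500,000 vB × 1 sat/vb =   500,000 sat
set C: 500,000 vB × 3 sat/vb = 1,500,000 sat
total:                         2,000,000 sat = 0.02 BTC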

I agree the cost for the adversary is significant if you assume outbound peers are using the default setting (300 MB, so 6 BTC), and it might not be opportune unless you're eclipsing a high-load Lightning node's chain view. Still, the interdependency, if added, doesn't sound great.

Doesn't it have a similar risk? Feed those peers with high-feerate conflicts and they won't be able to accept txs from my mempool, even though everything seems fine?

With this direction, you would only disconnect based on the maxmempool announced, not the seen fee filters. Of course, there is a similar risk of transaction propagation interference, though at least you're not losing full-relay peers that are good for the honesty of your chain view, I think.

vasild (Contributor) left a comment

Concept ACK

What about triggering this only if it is needed - if we have such a low-fee transaction of ours that we want to broadcast? Otherwise it looks like we will be kicking peers unnecessarily (a low-fee local tx may never be created).

Could this partition the network, having big mempools on one side and small mempools on the other side?

Comment on lines +5214 to +5217
// Ideally we want to compare mempool sizes, not fee filters.
// Otherwise we easily get confused: e.g. at empty mempools this is less
// critical, but that'd be impossible to account for.
const auto our_min_feerate = WITH_LOCK(m_mempool.cs, return m_mempool.GetMinFee());
Contributor

I don't get why it is better to compare mempool sizes.

  • If the peer's minfee is larger than the tx fee, then the tx will be dropped by that peer (regardless of their mempool size).
  • If the peer's minfee is smaller than the tx fee, then the tx will be accepted by that peer (regardless of their mempool size).

No? I see how the mempool size affects minfee, but isn't minfee what we care about?

Contributor

Now I realize that the assumption here is that we don't have yet a problematic low-fee local tx that we cannot broadcast. In that case, yes, I agree that the best approach is to compare the sizes of the mempools.

Comment on lines +5228 to +5229
if (tx_relay->m_fee_filter_received) {
peer_fee_filters.push_back(std::make_pair(pnode, tx_relay->m_fee_filter_received.load()));
Contributor

This reads like a boolean and got me confused ("if a fee filter has been received"). It is actually an integer. When dealing with atomics it is better to read once into a local variable and later refer to the local variable, to avoid the value changing in the meantime.

Suggested change
-if (tx_relay->m_fee_filter_received) {
-    peer_fee_filters.push_back(std::make_pair(pnode, tx_relay->m_fee_filter_received.load()));
+const auto minfee = tx_relay->m_fee_filter_received.load();
+if (minfee > 0) {
+    peer_fee_filters.push_back(std::make_pair(pnode, minfee));


if (peer_fee_filters.size() == 0) return;

std::sort(peer_fee_filters.begin(), peer_fee_filters.end());
Contributor

Is this sorting by the CNode* pointers instead of the fees? https://en.cppreference.com/w/cpp/utility/pair/operator_cmp
Looks like it would only compare the fees if the pointers are equal.

If std::map<CAmount, CNode*> is used, then this std::sort can be dropped.
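
Alternatively, keeping the vector but comparing on the fee explicitly would also fix it; a one-line sketch:

// Sort by the received fee filter rather than by the CNode* pointer:
std::sort(peer_fee_filters.begin(), peer_fee_filters.end(),
          [](const auto& a, const auto& b) { return a.second < b.second; });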

Member

Wouldn't it make sense to sort this only if our threshold is met? It seems that the only reason why this is sorted is so we can pick the highest fee filter, but we only need this if (around_our_percent_fee_filter < 4)

std::sort(peer_fee_filters.begin(), peer_fee_filters.end());

for (auto [node_id, fee_filter] : peer_fee_filters) {
if (fee_filter == 0) ++around_our_percent_fee_filter;
Contributor

This line can be dropped: only elements with a fee filter != 0 are added to peer_fee_filters.

instagibbs (Member)

Maybe we want to look at peers that aren't announcing a lot of INV relative to other peers of the same type and use it as some sort of apples-to-apples tie breaker? I'd be extremely careful of self-partitioning if we've already found what we believe to be useful outbound peers for blocks.
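
A hypothetical shape for such a tie-breaker (the invs_received counter and every name below are invented for illustration; nothing like this exists in the PR):

#include <algorithm>
#include <cstdint>
#include <vector>

struct PeerStats {
    int64_t node_id;
    int64_t fee_filter;     // BIP133 feefilter
    uint64_t invs_received; // tx announcements seen from this peer
};

// Prefer evicting the most restrictive peer; among equally restrictive
// peers, prefer the one announcing the fewest transactions.
int64_t PickEvictionCandidate(std::vector<PeerStats> peers)
{
    if (peers.empty()) return -1;
    std::sort(peers.begin(), peers.end(), [](const PeerStats& a, const PeerStats& b) {
        if (a.fee_filter != b.fee_filter) return a.fee_filter > b.fee_filter;
        return a.invs_received < b.invs_received;
    });
    return peers.front().node_id;
}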

naumenkogs (Member, Author)

@ariard

With this direction, you would only disconnect based on the maxmempool announced, not the seen fee filters. Of course, there is a similar risk of transaction propagation issue interference, though at least you're not losing full-relay peers that are good for the honesty of your chain view.

My point was that even though you continue to receive blocks, the same exact cost (6 BTC or whatever) would apply to censoring transactions (through spamming the peer's mempool) seamlessly. However, with my patch, an attacker has to repeat it until all outbound connections are occupied. With your patch, it's a one-time cost (assuming no other fix is applied). Right?
I agree we should continue comparing these two solutions.


@vasild

What about triggering this only if it is needed - if we have such a low-fee transaction

With time-sensitive transactions, I think you don't want to deal with it at the last moment. Although you're right, we should be careful with adding structure to the network. That's why my initial policy was that 4 peers should have a similar mempool, but I'm open to discussing it.


@instagibbs

That would be partitioning based on latency instead of just mempool size... Is it any better?

ariard left a comment

My point was that even though you continue to receive blocks, the same exact cost (6 BTC or whatever) would apply to censoring transactions (through spamming the peer's mempool) seamlessly. However, with my patch, an attacker has to repeat it until all outbound connections are occupied. With your patch, it's a one-time cost (assuming no other fix is applied). Right?
I agree we should continue comparing these two solutions.

So I think the mempool partitioning attack is a sufficient concern to weigh in the design of outbound peer eviction based on low mempool limits. From my understanding there are a few ways to implement it:

  • current solution: based on the received BIP133 feefilter and some margin
  • proposed solution 1: extend the p2p protocol to announce mempool size (i.e. make maxmempoolsize a p2p parameter)
  • proposed solution 2 (instagibbs), iiuc: based on the INV traffic received, or as a tie-breaker on top of the current solution

I think all solutions can be targeted by a mempool partitioning attack. The difference between the first solution (feefilter-based) and the second one (maxmempoolsize) is that partitioning a low-limit mempool (and as such a higher feefilter) sounds cheaper in the amount of satoshis that has to be locked up in the mempool. And it sounds like the same one-time cost whether you partition once or repeat it for each full-outbound peer, as the adversary's constraint here is the real value in satoshis that has to be locked up; no? It can be reused for each partition.

Beyond Lightning nodes and time-sensitive transactions, I think miners' mempools can be a source of concern: they need to receive new blocks while at the same time being informed of the best transaction feerates. If all their transaction-relay peers can be hijacked based on mempool-limit eviction, I believe you can downgrade their marginal income.

size_t around_our_percent_fee_filter = 0;

m_connman.ForEachNode([&](CNode* pnode) {
if (!pnode->IsFullOutboundConn() || pnode->fDisconnect) return;

Currently we have at most 2 block-relay-only connections (MAX_BLOCK_RELAY_ONLY_CONNECTIONS) and 8 full-relay connections. Whatever eviction frequency and fee differential are adopted as parameters to decide when to sever an outbound connection, I think it would be good to have more than 2 outbound block-relay peers strictly protected.

One protection heuristic could be to randomly pick 2 more IsFullOutboundConn() peers and mark them as protected from CheckForLimitedMempoolAndEvictPeers().
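
A standalone sketch of that heuristic (illustrative names, not the PR's code):

#include <algorithm>
#include <cstdint>
#include <random>
#include <unordered_set>
#include <vector>

// Randomly protect two full-relay outbound peers from mempool-limit eviction,
// on top of the block-relay-only connections that are already untouched.
std::unordered_set<int64_t> PickProtectedPeers(std::vector<int64_t> full_outbound_ids)
{
    std::shuffle(full_outbound_ids.begin(), full_outbound_ids.end(),
                 std::mt19937{std::random_device{}()});
    const size_t n = std::min<size_t>(2, full_outbound_ids.size());
    return {full_outbound_ids.begin(), full_outbound_ids.begin() + n};
}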


if (auto tx_relay = peer->GetTxRelay(); tx_relay != nullptr) {
if (tx_relay->m_fee_filter_received) {
peer_fee_filters.push_back(std::make_pair(pnode, tx_relay->m_fee_filter_received.load()));
Member

Pretty sure you need to call pnode->AddRef() before storing it in the vector and Release() at the end of CheckForLimitedMempoolAndEvictPeers for each CNode you stored. This can otherwise result in use-after-free, when a CNode is disconnected and deleted from a different thread after the ForEachNode loop.

I think in practice this won't happen because cs_main is locked but that looks accidental.
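
A sketch of the suggested refcounting pattern, using the existing CNode::AddRef()/Release() API; the surrounding structure mirrors the quoted snippet and is otherwise illustrative:

m_connman.ForEachNode([&](CNode* pnode) {
    if (!pnode->IsFullOutboundConn()) return;
    if (auto tx_relay = peer->GetTxRelay(); tx_relay != nullptr) {
        const auto minfee = tx_relay->m_fee_filter_received.load();
        if (minfee > 0) {
            pnode->AddRef(); // keep the CNode alive beyond the ForEachNode loop
            peer_fee_filters.push_back(std::make_pair(pnode, minfee));
        }
    }
});
// ... decide which peer (if any) to evict ...
for (const auto& [pnode, fee_filter] : peer_fee_filters) {
    pnode->Release(); // balance the AddRef once done with the pointer
}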

size_t around_our_percent_fee_filter = 0;

m_connman.ForEachNode([&](CNode* pnode) {
if (!pnode->IsFullOutboundConn() || pnode->fDisconnect) return;
Member

Disconnection is already filtered in ForEachNode

Suggested change
-if (!pnode->IsFullOutboundConn() || pnode->fDisconnect) return;
+if (!pnode->IsFullOutboundConn()) return;

Comment on lines +31 to +33
def mock_forward(self, delta):
self.mock_time += delta
self.nodes[0].setmocktime(self.mock_time)
Member

Since #28199 we have bumpmocktime, so you can just call self.nodes[0].bumpmocktime(delta).

@@ -5194,6 +5197,62 @@ void PeerManagerImpl::CheckForStaleTipAndEvictPeers()
}
}

void PeerManagerImpl::CheckForLimitedMempoolAndEvictPeers()
{
LOCK(cs_main);
Member

Why is cs_main locked for all of this?

* transactions constructed according to our mempool. Otherwise, the lower-fee transactions
* will never be relayed to the network.
*/
virtual void CheckForLimitedMempoolAndEvictPeers() = 0;
Member

This does not need to be part of PeerManager's public interface (unless you add unit tests).

size_t around_our_percent_fee_filter = 0;

m_connman.ForEachNode([&](CNode* pnode) {
if (!pnode->IsFullOutboundConn() || pnode->fDisconnect) return;
Member

pnode->fDisconnect shouldn't be needed here. CConnman::ForEachNode already filters based on CConnman::NodeFullyConnected, which checks !pnode->fDisconnect


if (auto tx_relay = peer->GetTxRelay(); tx_relay != nullptr) {
if (tx_relay->m_fee_filter_received) {
peer_fee_filters.push_back(std::make_pair(pnode, tx_relay->m_fee_filter_received.load()));
Member

Do we really need to store this in a vector and loop over it later? If the goal is to evict a single peer, can we not just perform the comparisons here?

Instead of storing every <CNode*, CAmount> pair we just store the one with the highest filter, starting with the first we see and comparing any other that matches our criteria. That way we don't have to:

  • Loop twice
  • Store all the relevant pairs
  • Sort the vector to pick the highest fee filter peer
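
A single-pass sketch along those lines, reusing the names from the quoted snippet and folding in the CFeeRate comparison suggested in the next comment:

// Track the most restrictive peer while counting acceptable ones, in one pass.
CNode* highest_filter_peer{nullptr};
CAmount highest_filter{0};
size_t around_our_percent_fee_filter{0};

m_connman.ForEachNode([&](CNode* pnode) {
    if (!pnode->IsFullOutboundConn()) return;
    if (auto tx_relay = peer->GetTxRelay(); tx_relay != nullptr) {
        const auto fee_filter = tx_relay->m_fee_filter_received.load();
        if (CFeeRate{fee_filter} <= our_min_feerate) {
            ++around_our_percent_fee_filter; // limits comparable to ours
        } else if (fee_filter > highest_filter) {
            highest_filter = fee_filter;     // remember the most restrictive peer
            highest_filter_peer = pnode;
        }
    }
});
if (around_our_percent_fee_filter < 4 && highest_filter_peer != nullptr) {
    highest_filter_peer->fDisconnect = true;
}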

Member

Given @vasild's comment (https://github.com/bitcoin/bitcoin/pull/28488/files#r1331255189) this should just be replacing the current highest fee filter, or if (CFeeRate{fee_filter} <= our_min_feerate) ++around_our_percent_fee_filter

@@ -5194,6 +5197,62 @@ void PeerManagerImpl::CheckForStaleTipAndEvictPeers()
}
}

void PeerManagerImpl::CheckForLimitedMempoolAndEvictPeers()
Member

I wonder how would this interfere with #28538 and whether this logic should be part of CConnman::GetExtraFullOutboundCount instead.

Comment on lines +39 to +41
peers = []
for id in range(8):
peers.append(node.add_outbound_p2p_connection(P2PDataStore(), p2p_idx=id, connection_type="outbound-full-relay"))
Member

nit: You can simply inline this:

Suggested change
-peers = []
-for id in range(8):
-    peers.append(node.add_outbound_p2p_connection(P2PDataStore(), p2p_idx=id, connection_type="outbound-full-relay"))
+peers = [node.add_outbound_p2p_connection(P2PDataStore(), p2p_idx=id, connection_type="outbound-full-relay") for id in range(8)]

Comment on lines +6 to +23
import time

from test_framework.blocktools import (
create_block,
create_coinbase,
)
from test_framework.messages import (
msg_pong,
msg_tx,
msg_feefilter,
)
from test_framework.p2p import (
P2PDataStore,
P2PInterface,
)
from test_framework.test_framework import BitcoinTestFramework
from test_framework.util import assert_equal, check_node_connections
from test_framework.wallet import MiniWallet
Member

A lot of these are not being used:

  • None from blocktools
  • msg_pong, msg_tx from messages
  • P2PInterface from p2p
  • MiniWallet from wallet

Member

The code path of not reaching the limit of outbound connections but reaching the threshold of peers over the fee filter limit is not being tested (i.e. GetExtraFullOutboundCount > 0 and around_our_percent_fee_filter < 4).

peers[4].send_and_ping(msg_feefilter(10))
node.mockscheduler(60)
self.mock_forward(60)
self.wait_until(lambda: len(self.nodes[0].getpeerinfo()) == 7, timeout=10)
Member

If we are going to be opinionated about what peer is being evicted (the one with the most restrictive filter) it may be worth setting up the test so we can check that is the case.

Comment on lines +53 to +56
# Set one more filter, but make it borderline acceptable.
peers[4].send_and_ping(msg_feefilter(81))
node.mockscheduler(60)
check_node_connections(node=node, num_in=0, num_out=8)
Member

I think this is wrong. Setting the filter for the fifth node already triggers the eviction. You can check by just adding self.wait_until(lambda: len(self.nodes[0].getpeerinfo()) == 7, timeout=10) here. Looks like there is a race between the peer being marked for disconnection and the check being performed.

This is mainly because setting the filter on one additional peer causes the threshold to be met. Also, I don't think 81 is borderline acceptable; it is actually the highest filter you have set so far.

instagibbs (Member) commented Oct 12, 2023

That would be partitioning based on latency instead of just mempool size... Is it any better?

I'm not sure I get the question. We should think very carefully about messing with outbounds that are otherwise deemed important for avoiding chainsplits.

mzumsande (Contributor) left a comment

I think it would be good to have some kind of limiting factor (maybe by fee estimation for a reasonably large target?) so that if someone decides to run with a much larger mempool size than the default, they don't keep dropping outbound peers that run with the default. Otherwise they could spend a long time circling through outbounds until they have found enough peers with a mempool of a similar size to theirs.

vasild (Contributor) commented Oct 13, 2023

I second @mzumsande above, plus I see this as too aggressive a change that will affect all nodes whereas few will benefit. In other words, this is not needed if a node does not have problems propagating its transactions. For such problems to happen, a node must be running with a larger-than-default mempool, or all its peers with smaller-than-default ones. Further, if the node's mempool is much larger than the default, then it may never find a peer with a comparable mempool, so it may loop through peers needlessly while such low-fee transactions have not been created yet (and may never be). So this only helps if the node's mempool is larger, but not too large, and it creates very-low-fee transactions.

I am not saying that this should be dropped, but does it make sense to enable it conditionally, under a configuration parameter? With a suggestion to turn it on if the mempool is larger than the default and very-low-fee, time-sensitive transactions are expected to be generated.

DrahtBot (Contributor) commented Feb 5, 2024

🤔 There hasn't been much activity lately and the CI seems to be failing.

If no one reviewed the current pull request by commit hash, a rebase can be considered. While the CI failure may be a false positive, the CI hasn't been running for some time, so there may be a real issue hiding as well. A rebase triggers the latest CI and makes sure that no silent merge conflicts have snuck in.

DrahtBot (Contributor) commented Jan 2, 2025

Are you still working on this?

fanquake (Member)

Closing for now, due to no follow-up. Please leave a comment if you want this reopened.

fanquake closed this Feb 20, 2025