p2p: Stop relaying non-mempool txs, improve tx privacy slightly #17303

maflcko · 2019-10-29T18:49:04Z

There is no reason to relay transactions that have been, but are no longer, in the mempool at the time of the getdata request for the transaction.

Possible reasons for mempool removal, and my thinking why it is safe to not relay

Reason		Rationale
`EXPIRY`	Expired from mempool	Mempool expiry is 2 weeks, so does not interfere with relay at all.
`SIZELIMIT`	Removed in size limiting	A low fee transaction, which will be relayed by a different peer after `GETDATA_TX_INTERVAL`. Assuming it ever made it to another peer.
`REORG`	Removed for reorganization	Block races are rare, so reorgs should be rarer. Also the tx will be reaccepted from the `disconnectpool` later on and then relayed (TODO), so this should never be observable in practise.
`BLOCK`	Removed for block	The other peer will eventually receive the (filtered)block with the tx in it.
`CONFLICT`	Removed for conflict with in-block transaction	The peer won't be able to add the tx to the mempool anyway. No need to waste bandwidth.
`REPLACED`	Removed for replacement	Same as above, send them the higher fee tx instead.

This should also improve transaction relay privacy slightly. Previously any peer (even inbounds) could ask for transactions in mapRelay, even if they were never announced to them via an inv message. After my changes, the peer can only request transactions that were added to the mempool before the last inv message was sent to that peer.

sdaftuar · 2019-10-29T19:23:28Z

I think this is the eventual right direction for us to go (because mapRelay doesn't have great memory bounds), but without something like #15505 this turns nodes into accidental DoSers whenever they evict a transaction after announcement and before a getdata comes in. If you're a node running with a small mempool, then transactions can be evicted even if they're not "low feerate".

I tend to think we should not be turning nodes with small mempools into DoSers; so if we solve that problem first (eg by making the peer receiving a NOTFOUND be able to find the transaction elsewhere in a reasonable amount of time), then I think we could go ahead and remove mapRelay and only serve things from our mempool. We should go ahead and fix the privacy problem (mentioned in #14220) at the same time though.

maflcko · 2019-10-29T20:12:47Z

We should go ahead and fix the privacy problem (mentioned in #14220) at the same time though.

It is trivial to fix that now, but I refuse to write code that does time comparisons that don't use std::chrono. Will push the commit after #17243 .

DrahtBot · 2019-10-29T22:52:17Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

Do not answer GETDATA for to-be-announced tx #18861 (Do not answer GETDATA for to-be-announced tx by sipa)
[net processing] Drop unknown types in getdata #18808 ([net processing] Drop unknown types in getdata by jnewbery)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

naumenkogs · 2019-11-04T19:06:46Z

This sounds reasonable to me, but I think we should be careful here to not introduce vulns as a side-effect.
So I'd want to give a more thorough review once all the pre-reqs mentioned above are satisfied.

maflcko · 2019-11-06T18:38:01Z

Minimized the diff. Also fixed the privacy issue as requested by @sdaftuar

naumenkogs · 2019-11-06T19:19:03Z

Summary

I think the privacy problem mentioned above is fixed here in a limited way.
There are 2 privacy threats related to the discussed behavior right now:
a) inferring an origin node of the particular transaction
b) inferring tx relay network topology

This proposal fixes a), but does not fix b), see below.
I also show that ideas of fixing b) after this change would require bringing back mapRelay in another form.

New attack

We will not respond if something wasn't INVed, right. But imagine flooding the network with tx_A, so that everybody announces is to an attacker and an attacker now can query for it.
Then in 30 seconds, when everyone does have tx_A, we selectively announce tx_B (which is an RBF to tx_A and will evict tx_A from the mempool) to node_victim. And start querying for tx_A persistently. Whoever responds with NOTFOUND received tx_B, which means that node was closer to node_victim.

Stuff like diffusion, previous PR addressing the problem and this proposal would make it difficult to infer tx_A or tx_B relay paths directly, but they can easily be deduced by querying for tx_A.

So, the cost of an attack is increased from 1 tx (or even 0 if using someone's txs) to 1 tx + 1 RBF, which does not really address the goal.

Ideas

Crazy expensive but fundamental

Keep responding about tx_A even when it's replaced with tx_B.
This would require storing tx_A while tx_B is still in the mempool. Afaik this is very opposite to the current intuition we implement in the codebase.

Less expensive but limited

Allow at most 1 (or 2 just in case) queries per transaction. So that tx_A can be requested by every node at most once.
In this case, an attacker would have to run more nodes or spend more double-spends. Not ideal, but at least something.
It would also require a data structure like a bloom filter to track our past responses.

Maybe this should be implemented as a follow-up PR, I might do that sometime.

jkczyz · 2019-11-12T20:47:19Z

src/net_processing.cpp

@@ -1554,20 +1554,17 @@ void static ProcessGetData(CNode* pfrom, const CChainParams& chainparams, CConnm
            const CInv &inv = *it;
            it++;

-            // Send stream from relay memory
            bool push = false;


You can now remove push and then update vNotFound in an else branch opposite the PushMessage below.

Happy to remove in a new commit if I get at least one ACK and others agree to do this.

push is used in #18038, so to avoid a merge conflict I will leave it as is for now. Whichever pull gets merged last can address this nit.

jkczyz · 2019-11-12T20:47:33Z

src/net_processing.cpp

-                connman->PushMessage(pfrom, msgMaker.Make(nSendFlags, NetMsgType::TX, *mi->second));
-                push = true;
-            } else {
+            {


Is this scope needed?

no, but it keeps the scope that came previously after the else to minimize the diff. Happy to remove in a new commit if I get at least one ACK and others agree to do this.

jkczyz · 2019-11-13T18:59:26Z

The PR title states that this change "improve tx privacy slightly". Could you elaborate in the PR description how? (i.e., why was it less private before?)

jnewbery · 2019-11-13T21:27:21Z

Concept ACK

@sdaftuar
You had two concept comments on this PR:

If we make this change, then it should address the privacy issue identified in Transaction relay privacy bugfix #14220. I haven't fully reviewed the second commit yet, but it seems like a reasonable fix (keep a per-peer timestamp of the last tx INV we sent, don't respond to GETDATAs for txs that entered our mempool after that timestamp).
This PR might exacerbate the problem in p2p: Request NOTFOUND transactions immediately from other outbound peers, when possible #15505 and we could stall relay for txs that are removed from our mempool for SIZELIMIT. That seems to me to be a very minor edge condition for transactions that are at the bottom of the mempool. Presumably we expect almost all nodes' mempool limits to be at least an order of magnitude greater than the block weight limit, which means that any tx that falls into this category has at least ten blocks worth of better transactions than it in the mempool, and we wouldn't expect it to be mined for 100 minutes. That's plenty of time for the peer to request the tx from another peer.

@naumenkogs
I don't think the topology probing technique you describe in #17303 (comment) is made any worse by this PR. We can always probe whether a tx is in a peer's mempool using the technique here: #16851 (comment).

ariard

Concept ACK on the privacy leak fix, i.e before this PR peers could have sent us unsolicited GETDATAs to identify tx in our mempool but not previously INVed. The strict requirement of tx m_time being earlier than m_last_inv_sent to be broadcast guarantee we don't send TX(tx) without INV them before.

On the potential DoS vector,

That's plenty of time for the peer to request the tx from another peer.

@jnewbery but I think Suhas point is requester is going to keep sending GETDATA as tx isn't erased from m_tx_process_time at NOTFOUND processing?

src/net_processing.cpp

jnewbery · 2019-11-14T18:25:20Z

I think Suhas point is requester is going to keep sending GETDATA as tx isn't erased from m_tx_process_time at NOTFOUND processing?

I think the concern is that a peer will send us an INV for a tx that then gets removed from its mempool because of SIZELIMIT. It'll reply with a NOTFOUND which we don't do anything with.

I think that's ok because one minute later we'll request from another peer that INV'ed the transaction to us:

bitcoin/src/net_processing.cpp

Line 4043 in cd6cb97

if (last_request_time <= current_time - GETDATA_TX_INTERVAL) {

and if the tx was removed from the first peer's mempool for SIZELIMIT, then one minute delay shouldn't be a problem (since we don't expect the tx to be mined any time soon).

ariard · 2019-11-14T20:09:38Z

I think that's ok because one minute later we'll request from another peer that INV'ed the transaction to us

Ok, I was making the assumption than the NOTFOUND peer is the origin one so you don't have an INV from other peers. Isn't our current logic going to loop on the sending GETDATA, receiving NOTFOUND sequence ? Or is this too an edge case?

sdaftuar · 2019-11-14T20:12:59Z

So the benefits of this PR are presumably to eliminate DoS attack surface on mapRelay (though I’m not sure anyone has articulated this clearly) and to solve a privacy issue. I think both those things could also be solved without introducing relay problems from announcing transactions and then not responding (and I’ve proposed one approach that I think would be satisfactory; I’m sure someone could probably propose a solution to both these things that involves keeping mapRelay around as well).

Anyway I don’t understand why we would be sloppy here; it’s not like relay degradation is some kind of fundamental trade-off with what we’re trying to accomplish. (And the DoS and privacy issues are not new, they have been around for a long time, so it doesn’t really seem like they are urgent to address either?)

(BTW if anyone wants to pick up #15505 and try to get that merged so that this gets unblocked, please go for it — I closed it earlier this year because there didn’t seem to be much review interest, and then reopened it in light of this PR to remind others of one approach to solving the issue that I thought needed to be fixed first, before we went in the direction proposed here.)

naumenkogs · 2019-11-15T00:55:54Z

@jnewbery

I don't think the topology probing technique you describe in #17303 (comment) is made any worse by this PR. We can always probe whether a tx is in a peer's mempool using the technique here: #16851 (comment).

I have never said it's made any worse. What I don't like is that in future when we'll want to fix privacy issue we might return mapRelay in a slightly different form, so I thought maybe we should do it right away here.
This is not a deal-breaker for me, but I want others to understand this while making a decision.

ajtowns · 2020-02-03T06:16:53Z

@naumenkogs

I have never said it's made any worse. What I don't like is that in future when we'll want to fix privacy issue we might return mapRelay in a slightly different form, so I thought maybe we should do it right away here.

I don't think we have a clear enough idea on how to fix privacy issues in non-token ways to do anything "right away" here? And if we'll want a different thing than mapRelay anyway, pulling mapRelay out now doesn't sound like it'll make things much more difficult?

ajtowns · 2020-02-03T11:18:37Z

I don't think this PR really makes reacting to notfounds a more urgent issue for all the reasons @jnewbery lists. But I had a look at #15505 again anyway, and think that logic like "if we see a notfound for an inflight tx, look through our other peers to reset the process_time for that tx that was soonest to right now" should be fine -- we don't need to take any new locks, and we're just iterating over and looking up some maps which should be plenty fast. Example patch at https://github.com/ajtowns/bitcoin/commits/202002-bump-notfound

I don't think the p2p_leak_tx test is quite right -- if I take away the "break" after a GETDATA fails, it continues for a while then fails on the txid in inbound_peer.last_message['inv'] assertion. Not sure why; maybe there's a race condition if the getdata comes in between the inv getting queued and actually send or similar?

maflcko · 2020-03-25T14:17:11Z

I don't think the p2p_leak_tx test is quite right -- if I take away the "break" after a GETDATA fails, it continues for a while then fails on the txid in inbound_peer.last_message['inv'] assertion. Not sure why; maybe there's a race condition if the getdata comes in between the inv getting queued and actually send or similar?

The test is "right". It is only single-use, as correctly assumed by you: In the next run after receiving the notfound, it will receive a bunch of invs of previous transactions (but not yet the inv of the current transaction), so the assertion will fail.

DrahtBot · 2020-05-12T03:04:03Z

🐙 This pull request conflicts with the target branch and needs rebase.

… for it

adamjonas · 2020-06-12T18:21:58Z

Looks like there are still a couple of references to mapRelay in net_processing.cpp left to remove.

maflcko · 2020-06-12T19:04:04Z

Please don't review yet. This has not yet properly been rebased.

maflcko · 2020-06-28T12:00:31Z

Closing for now. There have been some other changes, so if this still makes sense, it is probably best to start a new pull request

f32c408 Make sure unconfirmed parents are requestable (Pieter Wuille) c4626bc Drop setInventoryTxToSend based filtering (Pieter Wuille) 43f02cc Only respond to requests for recently announced transactions (Pieter Wuille) b24a17f Introduce constant for mempool-based relay separate from mapRelay caching (Pieter Wuille) a9bc563 Swap relay pool and mempool lookup (Pieter Wuille) Pull request description: This implements the follow-up suggested here: #18861 (comment) . Instead of checking `setInventoryTxToSend`, maintain an explicit bloom filter with the 3500 most recently announced invs, and permit fetching any of these as long as they're in the relay pool or the mempool. In addition, permit relay from the mempool after just 2 minutes instead of 15. This: * Fixes the brief opportunity an attacker has to request unannounced invs just after the connection is established (pointed out by naumenkogs, see #18861 (comment)). * Guarantees that locally resubmitted invs after `filterInventoryKnown` rolls over can still be requested (pointed out by luke-jr, see #18861 (comment)). It adds 37 KiB of filter per peer. This is also a step towards dropping the relay pool entirely and always relaying from the mempool directly (see #17303), but that is still blocked by dealing properly with NOTFOUNDs (see #18238). ACKs for top commit: jnewbery: reACK f32c408 jonatack: re-ACK f32c408 per `git range-diff f7c19e8 2da7ee3 f32c408` and redid the following: code review, thought about motivation, DoS and privacy aspects, debug build to check for warnings after updating Clang from 6 to 11 since last review. ajtowns: re-ACK f32c408 Tree-SHA512: aa05b9fd01bad59581c4ec91836a52d7415dc933fa49d4c4adced79aa25aaad51e11166357e8c8b29fbf6021a7401b98c21b850b5d8e8ad773fdb5d6608e1e85

f32c408 Make sure unconfirmed parents are requestable (Pieter Wuille) c4626bc Drop setInventoryTxToSend based filtering (Pieter Wuille) 43f02cc Only respond to requests for recently announced transactions (Pieter Wuille) b24a17f Introduce constant for mempool-based relay separate from mapRelay caching (Pieter Wuille) a9bc563 Swap relay pool and mempool lookup (Pieter Wuille) Pull request description: This implements the follow-up suggested here: bitcoin#18861 (comment) . Instead of checking `setInventoryTxToSend`, maintain an explicit bloom filter with the 3500 most recently announced invs, and permit fetching any of these as long as they're in the relay pool or the mempool. In addition, permit relay from the mempool after just 2 minutes instead of 15. This: * Fixes the brief opportunity an attacker has to request unannounced invs just after the connection is established (pointed out by naumenkogs, see bitcoin#18861 (comment)). * Guarantees that locally resubmitted invs after `filterInventoryKnown` rolls over can still be requested (pointed out by luke-jr, see bitcoin#18861 (comment)). It adds 37 KiB of filter per peer. This is also a step towards dropping the relay pool entirely and always relaying from the mempool directly (see bitcoin#17303), but that is still blocked by dealing properly with NOTFOUNDs (see bitcoin#18238). ACKs for top commit: jnewbery: reACK f32c408 jonatack: re-ACK f32c408 per `git range-diff f7c19e8 2da7ee3 f32c408` and redid the following: code review, thought about motivation, DoS and privacy aspects, debug build to check for warnings after updating Clang from 6 to 11 since last review. ajtowns: re-ACK f32c408 Tree-SHA512: aa05b9fd01bad59581c4ec91836a52d7415dc933fa49d4c4adced79aa25aaad51e11166357e8c8b29fbf6021a7401b98c21b850b5d8e8ad773fdb5d6608e1e85

fanquake added the P2P label Oct 29, 2019

fanquake requested review from sdaftuar and ajtowns October 29, 2019 18:53

maflcko force-pushed the 1910-p2pNoRemovedTxs branch from fae27e8 to faf64f7 Compare October 29, 2019 20:08

maflcko added the Waiting for author label Oct 30, 2019

maflcko removed the Waiting for author label Nov 6, 2019

maflcko force-pushed the 1910-p2pNoRemovedTxs branch 2 times, most recently from fadf136 to faa31e4 Compare November 6, 2019 16:25

maflcko changed the title ~~p2p: Stop relaying non-mempool txs~~ p2p: Stop relaying non-mempool txs, improve tx privacy Nov 6, 2019

maflcko changed the title ~~p2p: Stop relaying non-mempool txs, improve tx privacy~~ p2p: Stop relaying non-mempool txs, improve tx privacy slightly Nov 6, 2019

maflcko force-pushed the 1910-p2pNoRemovedTxs branch from faa31e4 to fa5f3da Compare November 11, 2019 20:38

jkczyz reviewed Nov 13, 2019

View reviewed changes

ariard reviewed Nov 14, 2019

View reviewed changes

src/net_processing.cpp Outdated Show resolved Hide resolved

luke-jr mentioned this pull request Jan 29, 2020

net: Remove forcerelay of rejected txs #17985

Merged

ajtowns removed their request for review February 3, 2020 11:19

DrahtBot mentioned this pull request Feb 11, 2020

Use wtxid for transaction relay #18044

Merged

2 tasks

DrahtBot mentioned this pull request Feb 11, 2020

P2P: Mempool tracks locally submitted transactions to improve wallet privacy #18038

Merged

ajtowns mentioned this pull request Mar 2, 2020

net_processing: Retry notfounds with more urgency #18238

Closed

DrahtBot mentioned this pull request Mar 20, 2020

Erlay: bandwidth-efficient transaction relay protocol #18261

Closed

maflcko force-pushed the 1910-p2pNoRemovedTxs branch 3 times, most recently from faa40d1 to fa470f9 Compare March 25, 2020 13:56

DrahtBot mentioned this pull request Mar 27, 2020

test: Add test for wtxid transaction relay #18446

Closed

DrahtBot mentioned this pull request Apr 29, 2020

[net processing] Drop unknown types in getdata #18808

Merged

maflcko force-pushed the 1910-p2pNoRemovedTxs branch 2 times, most recently from fa63155 to fa9b9d1 Compare April 29, 2020 19:59

DrahtBot mentioned this pull request May 4, 2020

Do not answer GETDATA for to-be-announced tx #18861

Merged

DrahtBot added the Needs rebase label May 12, 2020

maflcko force-pushed the 1910-p2pNoRemovedTxs branch 2 times, most recently from 7bfd0f1 to ca65330 Compare May 23, 2020 14:49

MarcoFalke added 2 commits May 23, 2020 11:01

p2p: Stop relaying non-mempool txs

11c0901

p2p: Stop relaying txs to nodes that could have never received an inv…

4dcefd7

… for it

maflcko force-pushed the 1910-p2pNoRemovedTxs branch from ca65330 to 4dcefd7 Compare May 23, 2020 15:05

sipa mentioned this pull request May 29, 2020

Only allow getdata of recently announced invs #19109

Merged

maflcko marked this pull request as draft June 12, 2020 19:33

maflcko closed this Jun 28, 2020

maflcko deleted the 1910-p2pNoRemovedTxs branch June 28, 2020 12:00

bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022

p2p: Stop relaying non-mempool txs, improve tx privacy slightly #17303

p2p: Stop relaying non-mempool txs, improve tx privacy slightly #17303

Uh oh!

Conversation

maflcko commented Oct 29, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sdaftuar commented Oct 29, 2019

Uh oh!

maflcko commented Oct 29, 2019

Uh oh!

DrahtBot commented Oct 29, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Conflicts

Uh oh!

naumenkogs commented Nov 4, 2019

Uh oh!

maflcko commented Nov 6, 2019

Uh oh!

naumenkogs commented Nov 6, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

New attack

Ideas

Crazy expensive but fundamental

Less expensive but limited

Uh oh!

jkczyz Nov 12, 2019

Choose a reason for hiding this comment

Uh oh!

maflcko Nov 13, 2019

Choose a reason for hiding this comment

Uh oh!

maflcko Mar 25, 2020

Choose a reason for hiding this comment

Uh oh!

jkczyz Nov 12, 2019

Choose a reason for hiding this comment

Uh oh!

maflcko Nov 13, 2019

Choose a reason for hiding this comment

Uh oh!

jkczyz commented Nov 13, 2019

Uh oh!

jnewbery commented Nov 13, 2019

Uh oh!

ariard left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jnewbery commented Nov 14, 2019

Uh oh!

ariard commented Nov 14, 2019

Uh oh!

sdaftuar commented Nov 14, 2019

Uh oh!

naumenkogs commented Nov 15, 2019

Uh oh!

ajtowns commented Feb 3, 2020

Uh oh!

ajtowns commented Feb 3, 2020

Uh oh!

maflcko commented Mar 25, 2020

Uh oh!

DrahtBot commented May 12, 2020

Uh oh!

adamjonas commented Jun 12, 2020

Uh oh!

maflcko commented Jun 12, 2020

Uh oh!

maflcko commented Jun 28, 2020

Uh oh!

Uh oh!

maflcko commented Oct 29, 2019 •

edited

Loading

DrahtBot commented Oct 29, 2019 •

edited

Loading

naumenkogs commented Nov 6, 2019 •

edited

Loading