[WIP] p2p: Add random txn's from mempool to GETBLOCKTXN #27086

davidgumberg · 2023-02-11T21:13:24Z

As compact block completion currently works, a node's GETBLOCKTXN request for the transactions it is missing during compact block relay reveals precisely which transactions from the published block it already had in its mempool. The greatest danger here is that a node never requests its own transactions, since those are always in its mempool. Given a "sufficient number" of GETBLOCKTXN's from a single peer, it becomes possible to identify that peer's wallet addresses with some degree of confidence.

Assuming that all transactions except a node's own have a nonzero probability of not being in the node's mempool when a block is discovered, an attacker with an infinite set of GETBLOCKTXN's from a single peer that reuses a finite number of pubkeys will have 100% confidence about which addresses belong to that peer.

I am not a statistician, but I am actively trying to work out how large the "sufficient number" needed for a reasonable degree of confidence in a peer-pubkey correlation is, and whether collecting that many requests is a realistic scenario.
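One rough way to size that "sufficient number", under strong simplifying assumptions that are not part of this PR: suppose every transaction that is not the peer's own is independently missing from its mempool with probability p when a block arrives, and the attacker has observed k confirmed transactions tied to one candidate address. An unrelated address then survives all k observations without ever being requested with probability (1 - p)^k. The function names and any values of p are illustrative only.

```python
import math


def false_positive_probability(p_missing: float, k_observations: int) -> float:
    """Chance that k transactions of an unrelated address were all never requested.

    Assumes each observation is an independent miss with probability p_missing.
    """
    return (1.0 - p_missing) ** k_observations


def observations_needed(p_missing: float, max_false_positive: float) -> int:
    """Smallest k such that (1 - p_missing) ** k <= max_false_positive."""
    if not 0.0 < p_missing < 1.0:
        raise ValueError("p_missing must be strictly between 0 and 1")
    if not 0.0 < max_false_positive < 1.0:
        raise ValueError("max_false_positive must be strictly between 0 and 1")
    return math.ceil(math.log(max_false_positive) / math.log(1.0 - p_missing))
```

Under this model the required number of observations grows quickly as p shrinks, so a node that rarely misses transactions leaks correspondingly little per block. Real misses are unlikely to be independent, so this is at best an order-of-magnitude guide.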

This PR prevents mempool fingerprinting by randomly adding roughly 1 in 200 (0.5%) of the block transactions already in our mempool to our GETBLOCKTXN request. Nodes that have less complete mempools (worse connections) will have fewer excess txn's to relay. (A node with 50% of a block missing from its mempool will tend to request about 5 excess transactions if there are 2000 txn's in the block.) 0.5% is a number I mostly pulled out of thin air, but a maximum impact of 0.5% seems like a reasonable price to pay if the fingerprinting attack described is realistic.
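The arithmetic in the parenthetical can be checked with a short sketch. It assumes each block transaction already in the mempool is re-requested independently with probability 0.005; `expected_excess` and `pick_decoys` are hypothetical names used for illustration, not functions from this PR's diff.

```python
import random
from typing import Iterable, List, Optional

DECOY_RATE = 0.005  # roughly 1 in 200, the rate proposed above


def expected_excess(block_txs: int, missing_fraction: float,
                    rate: float = DECOY_RATE) -> float:
    """Expected number of already-known transactions requested again."""
    already_have = block_txs * (1.0 - missing_fraction)
    return already_have * rate


def pick_decoys(have_indexes: Iterable[int], rate: float = DECOY_RATE,
                rng: Optional[random.Random] = None) -> List[int]:
    """Select each already-known block index independently with probability `rate`."""
    rng = rng or random.Random()
    return [i for i in have_indexes if rng.random() < rate]
```

For a 2000-transaction block with half of it missing, `expected_excess(2000, 0.5)` evaluates to about 5, matching the figure above; a node with a complete mempool would request about 10 extra transactions.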

DrahtBot · 2023-02-11T21:13:26Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Concept NACK	naumenkogs
Approach NACK	sipa

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

In order to prevent fingerprinting, especially of our own txn's, this adds a ~0.5% chance that transactions already in our mempool get added to our GETBLOCKTXN request. Nodes that have less complete mempools are likely to have fewer excess txn's to relay.

sipa · 2023-02-12T04:07:17Z

It's an interesting observation that our responses to compact block announcements reveal something about our mempool, but I'm not sure it's worth the cost of addressing that:

Blocks are rare, and very expensive to produce, meaning that per block only a few of our peers even get the chance to query us about it (and it's unaffordable to produce more close-to-tip blocks to trigger that).
Increasing the size of compact block responses may actually add to propagation latency, especially when it results in a response that now needs more TCP packets (the bandwidth isn't the concern here).
Just empirically, compact block relay works very well (on my well-connected node without wallets, 91% of blocks are reconstructed without asking for any transactions; 3.6% need 1 transaction; 3.2% need 2 transactions; 0.5% need 3 transactions; 1.1% need several). So even when our peers get a chance to learn something, there generally is very little to learn.

If we wanted to do something about this information leak nonetheless, I believe the right approach would be using the m_recently_announced_invs filter which we maintain for all our peers, and just add all transactions to the compact block response that we haven't told our peer about yet (and if there are too many, perhaps just immediately fall back to standard block relay).
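A minimal sketch of that alternative, assuming the per-peer record of announced transactions can be modeled as a plain set of transaction ids (in Bitcoin Core `m_recently_announced_invs` is a probabilistic filter, not a set) and using a hypothetical `MAX_EXTRA` cutoff for the fallback. The function name, the cutoff value, and the data shapes are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Set

MAX_EXTRA = 50  # hypothetical cutoff; no value was proposed in the discussion


def build_request(block_txids: Sequence[str],
                  mempool_txids: Set[str],
                  announced_to_peer: Set[str],
                  max_extra: int = MAX_EXTRA) -> Optional[List[int]]:
    """Return block indexes to request, or None to fall back to a full block.

    Requests every transaction we lack, plus every transaction we hold but
    have not announced to this peer, so the request only confirms what the
    peer was already told.
    """
    missing: List[int] = []
    unannounced: List[int] = []
    for index, txid in enumerate(block_txids):
        if txid not in mempool_txids:
            missing.append(index)
        elif txid not in announced_to_peer:
            unannounced.append(index)
    if len(unannounced) > max_extra:
        return None  # too much padding: request the whole block instead
    return sorted(missing + unannounced)
```

With this shape the peer sees a request that is consistent with a mempool containing only what we already announced to it, which is the property described above; the cost is the extra transactions transferred.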

naumenkogs · 2023-02-13T08:58:31Z

I agree with @sipa, with a stronger emphasis that I would probably NACK this change because the cost of this fix is too high, and the privacy gain is too low.

You may be interested in contributing to some SPV client implementation instead :) I'm curious how well they preserve privacy when they request transactions/blocks (that subset which is of interest to them specifically). E.g. whether they ask the same node to provide everything — then the node can correlate.

maflcko · 2023-02-13T10:48:00Z

Wouldn't it be better to not add wallet transactions to the mempool if we don't want peers to query our mempool for wallet transactions?

See also #11887 (comment) (and all in- and out- links in this issue)

petertodd · 2023-02-13T13:07:52Z

> Just empirically, compact block relay works very well

Note that it's very easy for an adversary to change that by simply broadcasting simultaneous double-spends with the same fee. Indeed, n-way double spends broadcast to n different nodes is easy to do. So I don't think the observation that it works well right now is relevant to the adversarial case.

sipa · 2023-02-13T14:41:28Z

> I agree with @sipa, with a stronger emphasis that I would probably NACK this change because the cost of this fix is too high, and the privacy gain is too low.

Yeah, Approach NACK. I may be convinced that doing something to avoid mempool fingerprinting through GETBLOCKTXN is worth it, but if we want that, there are better ways than this.

> Note that it's very easy for an adversary to change that by simply broadcasting simultaneous double-spends with the same fee. Indeed, n-way double spends broadcast to n different nodes is easy to do. So I don't think the observation that it works well right now is relevant to the adversarial case.

That's fair; the other arguments are stronger.

> Wouldn't it be better to not add wallet transactions to the mempool if we don't want peers to query our mempool for wallet transactions?

I don't think that's a good idea. The point is that we shouldn't treat wallet transactions any differently from transactions received from other peers. If we don't add wallet transactions to the mempool but still relay them (because otherwise nobody will ever know about them), we're adding a giant fingerprint to identify our transactions (relayed but not in mempool...).

I think the focus of this PR on wallet transactions in general is distracting. The issue, if any, is mempool fingerprinting. That might be used by attackers to learn about our wallet transactions, but also about many other things. But the solution isn't specific to wallet things; it should just be to prevent attackers from learning anything about our mempool transactions that haven't been announced to them.

maflcko · 2023-02-13T14:45:16Z

> If we don't add wallet transactions to the mempool but still relay them

Yeah, I didn't mention this, but obviously we wouldn't relay them with the mempool. Doing a one-shot (tor-only) outbound connection to fan-out the tx (one-hop dandelion) without adding it to the mempool shouldn't leave a fingerprint, other than the one left by the tor-only connection, no?

sipa · 2023-02-13T14:48:10Z

@MarcoFalke Oh, fair enough, that's a good idea (though it'd probably still need a fallback to normal relay after some delay if we don't observe the transaction being rumoured back to us). I also think it's orthogonal to the idea here, because even absent "first mile" wallet broadcast leakage, we still want the P2P network to obscure transaction relay beyond that.

maflcko · 2023-02-13T15:03:07Z

> we still want the P2P network to obscure transaction relay beyond that

I wonder if that is worth it. Given this issue here (and past ones), it just seems hard to think about and any guarantees are at best brittle in an evolving P2P network. So, long term, assuming the private "first mile" privacy-preserving fan out stuff is available, users and wallets caring about it will probably use that. Attempts to optimize the normal relay to be equally privacy-preserving will always have a taste of a false promise and it might be more honest to just tell people to not rely on that.

sipa · 2023-02-13T15:15:24Z

We can't rely on Tor for all wallet privacy, especially given that it's a centralized service that might just fail completely one day (and before that, it's hard to bound how much sufficiently powerful attackers can learn from traffic analysis in Tor).

Privacy on a public network is always multi-faceted, and it's fair we can't make strong guarantees. But on the other hand, we go through pretty substantial efforts to hide lots of things on a best-effort basis, especially involving transaction relay. And they're not all reducible to protecting wallet privacy (there is eclipse attack protection, fingerprinting for connection graph information, ...).

achow101 · 2023-04-25T16:15:30Z

This PR does not seem to have conceptual support. Please leave a comment if you would like this to be reopened.

davidgumberg force-pushed the wip-rndtxinclude branch 2 times, most recently from 7dc8ea1 to 1d80598 Compare February 11, 2023 22:46

davidgumberg force-pushed the wip-rndtxinclude branch from 1d80598 to db84b1f Compare February 11, 2023 22:55

glozow added the P2P label Feb 13, 2023

achow101 closed this Apr 25, 2023

Crypt-iQ mentioned this pull request Aug 15, 2023

compact block fingerprinting #28272

Open

bitcoin locked and limited conversation to collaborators Apr 24, 2024
