[WIP] p2p: Add random txn's from mempool to GETBLOCKTXN #27086
7dc8ea1 to 1d80598
In order to prevent fingerprinting, especially of our own transactions, this adds a ~0.5% chance that each transaction already in our mempool gets added to our GETBLOCKTXN request. Nodes that have less complete mempools are likely to have fewer excess transactions to request.
1d80598 to db84b1f
It's an interesting observation that our responses to compact block announcements reveal something about our mempool, but I'm not sure it's worth the cost of addressing that:
If we wanted to do something about this information leak nonetheless, I believe the right approach would be using the
I agree with @sipa, with a stronger emphasis that I would probably NACK this change because the cost of this fix is too high and the privacy gain is too low. You may be interested in contributing to some SPV client implementation instead :) I'm curious how well they preserve privacy when they request transactions/blocks (the subset that is of interest to them specifically). E.g. whether they ask the same node to provide everything, in which case the node can correlate.
Wouldn't it be better to not add wallet transactions to the mempool if we don't want peers to query our mempool for wallet transactions? See also #11887 (comment) (and all in- and out- links in this issue)
Note that it's very easy for an adversary to change that by simply broadcasting simultaneous double-spends with the same fee. Indeed, broadcasting n-way double spends to n different nodes is easy to do. So I don't think the observation that it works well right now is relevant to the adversarial case.
Yeah, Approach NACK. I may be convinced that doing something to avoid mempool fingerprinting through GETBLOCKTXN is worth it, but if we want that, there are better ways than this.
That's fair; the other arguments are stronger.
I don't think that's a good idea. The point is that we shouldn't treat wallet transactions any differently from transactions received from other peers. If we don't add wallet transactions to the mempool but still relay them (because otherwise nobody will ever know about them), we're adding a giant fingerprint to identify our transactions (relayed but not in mempool...). I think the focus of this PR on wallet transactions in general is distracting. The issue, if any, is mempool fingerprinting. That might be used by attackers to learn about our wallet transactions, but also about many other things. But the solution isn't specific to wallet things; it should just be to prevent attackers from learning anything about our mempool transactions that haven't been announced to them.
Yeah, I didn't mention this, but obviously we wouldn't relay them with the mempool. Doing a one-shot (Tor-only) outbound connection to fan out the tx (one-hop dandelion) without adding it to the mempool shouldn't leave a fingerprint, other than the one left by the Tor-only connection, no?
@MarcoFalke Oh, fair enough, that's a good idea (though it'd probably still need a fallback to normal relay after some delay if we don't observe the transaction being rumoured back to us). I also think it's orthogonal to the idea here, because even absent "first mile" wallet broadcast leakage, we still want the P2P network to obscure transaction relay beyond that.
I wonder if that is worth it. Given this issue here (and past ones), it just seems hard to think about and any guarantees are at best brittle in an evolving P2P network. So, long term, assuming the private "first mile" privacy-preserving fan out stuff is available, users and wallets caring about it will probably use that. Attempts to optimize the normal relay to be equally privacy-preserving will always have a taste of a false promise and it might be more honest to just tell people to not rely on that.
We can't rely on Tor for all wallet privacy, especially given that it's a centralized service that might just fail completely one day (and before that, it's hard to bound how much sufficiently powerful attackers can learn from traffic analysis in Tor). Privacy on a public network is always multi-faceted, and it's fair we can't make strong guarantees. But on the other hand, we go through pretty substantial efforts to hide lots of things on a best-effort basis, especially involving transaction relay. And they're not all reducible to protecting wallet privacy (there is eclipse attack protection, fingerprinting for connection graph information, ...).
This PR does not seem to have conceptual support. Please leave a comment if you would like this to be reopened.
As compact block completion currently works, a node reveals precisely which transactions from a published block it already has in its mempool when it sends a GETBLOCKTXN request for the transactions it is missing during compact block relay. The greatest danger here is that nodes will never request their own transactions. Given a "sufficient number" of GETBLOCKTXN requests from a single peer, it becomes possible to identify that peer's wallet addresses with some degree of confidence.
Assuming that every transaction except a node's own has a nonzero probability of not being in the node's mempool when a block is discovered, an attacker with an infinite set of GETBLOCKTXN requests from a single peer that reuses a finite number of pubkeys will have 100% confidence about which addresses belong to that peer.
I am not a statistician, but I am actively trying to work out how large the "sufficient number" is that gives a reasonable degree of confidence about a peer-pubkey correlation, and whether it is a realistic scenario.
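
The "sufficient number" question can be framed with a rough back-of-the-envelope model. The sketch below is only an illustration and not part of this PR. It assumes that each transaction which is not the peer's own is missing from its mempool independently with a fixed probability `p_miss` (a hypothetical parameter; real mempools are unlikely to behave this simply), and asks how many never-requested transactions tied to one address an observer would need to see before chance alone becomes an unlikely explanation. The function names are made up for this example.

```python
import math


def chance_never_requested(p_miss: float, k: int) -> float:
    """Probability that k foreign transactions are all already in the
    mempool (so none is ever requested), under the independence assumption."""
    if not 0.0 <= p_miss <= 1.0 or k < 0:
        raise ValueError("need 0 <= p_miss <= 1 and k >= 0")
    return (1.0 - p_miss) ** k


def observations_needed(p_miss: float, alpha: float) -> int:
    """Smallest k for which (1 - p_miss)**k <= alpha, i.e. the number of
    never-requested transactions after which 'just luck' has probability
    at most alpha in this simplified model."""
    if not 0.0 < p_miss < 1.0 or not 0.0 < alpha < 1.0:
        raise ValueError("need 0 < p_miss < 1 and 0 < alpha < 1")
    return math.ceil(math.log(alpha) / math.log(1.0 - p_miss))


if __name__ == "__main__":
    # Hypothetical inputs: 5% miss rate, 5% tolerated chance of coincidence.
    print(observations_needed(p_miss=0.05, alpha=0.05))
```

The model ignores prior odds and the fact that an observer would be testing many addresses at once, so it should be read as a lower bound on the effort needed rather than an estimate of the attack's real cost.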
This PR prevents mempool fingerprinting by randomly adding ~1 in 200 (0.5%) of the block transactions already in our mempool to our GETBLOCKTXN request. Nodes that have less complete mempools (worse connections) will have fewer excess transactions to request. (A node missing 50% of a 2000-transaction block from its mempool will tend to request about 5 excess transactions.) 0.5% is a number I mostly pulled out of thin air, but a maximum impact of 0.5% seems like a reasonable price to pay if the fingerprinting attack described is realistic.
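
The padding idea and the arithmetic above can be sketched in a few lines. This is an illustrative Python model, not the C++ change in this PR: the thread does not show how the request is actually built, so `build_request_indexes`, `expected_decoys`, and `PAD_PROBABILITY` are hypothetical names chosen for the example.

```python
import random

PAD_PROBABILITY = 0.005  # ~1 in 200, the figure quoted in the PR description


def build_request_indexes(in_mempool, rng, pad_probability=PAD_PROBABILITY):
    """Return the block indexes to request.

    in_mempool[i] is True when block transaction i is already in our mempool.
    Every missing transaction is requested; each transaction we already have
    is additionally requested with probability pad_probability as a decoy.
    """
    request = []
    for index, have in enumerate(in_mempool):
        if not have or rng.random() < pad_probability:
            request.append(index)
    return request


def expected_decoys(block_tx_count, fraction_missing,
                    pad_probability=PAD_PROBABILITY):
    """Expected number of decoy requests for one block."""
    already_have = block_tx_count * (1.0 - fraction_missing)
    return already_have * pad_probability


if __name__ == "__main__":
    # The example from the description: 2000 transactions, half missing.
    print(expected_decoys(2000, 0.5))
    rng = random.Random(1)  # seeded only to make the demo repeatable
    have = [i % 2 == 0 for i in range(2000)]
    print(len(build_request_indexes(have, rng)))
```

Because decoys are drawn only from transactions the node already has, a node with a sparse mempool pads less in absolute terms, which matches the observation in the description.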