-
Notifications
You must be signed in to change notification settings - Fork 37.7k
Periodically update DNS caches for better privacy of non-reachable nodes #18421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Can this use the scheduler instead of using a separate thread? |
Concept ACK |
be78cdc
to
5e87cba
Compare
@sipa yes! I simply forgot that this can be done with scheduler. Just force-pushed an update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some style-nits
What is the maximum time this can take? If it is unbounded, it seems that this would be a trivial DOS attack against a node, no? |
0527577
to
5299c3b
Compare
@sipa I was not aware that scheduled event will block other execution threads, and now that Marco pointed out that attack, i'm thinking separate thread makes more sense? We don't want to make any assumptions about these requests executing very fast. |
|
@naumenkogs It won't block other threads, but it will delay other events in the same scheduler. I agree that it's probably too slow for that (I don't know what latency @MarcoFalke Vaguely related, do we have any policies around what maximum execution duration things in the main scheduler are allowed to have? |
Agreed that that RAM overhead of a new thread for just this is no good. Yeah, I would say normal execution of this function takes 2s, abnormal may take...a minute even if everyone is honest, due to random internet/os glitches. An alternative crazy idea is to have a special thread for a number of rare not-so-critical functions, which might take long time (for example, I can imagine DumpAddresses takes long due to I/O issues). |
Reachable nodes would query DNS servers every hour for all widely used combinations of service bits. This would give non-reachable nodes from the same subnets more privacy, because they will be hitting early DNS caches.
5299c3b
to
de74888
Compare
// the only way to check if it is reachable is to observe inbound connections. | ||
// This check is made every time, because a node may become non-reachable | ||
// due to external factors (e.g. local network settings update). | ||
if (CountInboundConnections() == 0) return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about a -dnscloaking
option to force it on/off?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Will add this config and add the documentation once we get more concept acks.
// updated to invalide records for that service bit. | ||
std::vector<uint64_t> SERVICE_BITS_COMBINATIONS{ | ||
// Legacy nodes | ||
NODE_NETWORK, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't these be requesting the bare name without any service bit specification?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be x1.dns.whatever.com?
std::vector<uint64_t> SERVICE_BITS_COMBINATIONS{ | ||
// Legacy nodes | ||
NODE_NETWORK, | ||
NODE_NETWORK_LIMITED, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What nodes would ever request this?
|
||
// query for all widely used combinations of service bits | ||
for (uint64_t service_bits_combination : SERVICE_BITS_COMBINATIONS) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these really be done all at the same time? No new node would batch them like this...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, one private node in the subnet will request one combination, another private node will request another one...
Why would you want the already public node separate cache updates in time?
threadDNSAddressSeed = std::thread(&TraceThread<std::function<void()> >, "dnsseed", std::function<void()>(std::bind(&CConnman::ThreadDNSAddressSeed, this))); | ||
if (fListen && fNameLookup && !HaveNameProxy()) { | ||
LogPrintf("Proactive querying DNS servers to update caches is enabled\n"); | ||
scheduler.scheduleEvery([this] { DNSCachesUpdate(); }, DNS_CACHES_UPDATE_INTERVAL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably there should be some randomness added to the interval for each iteration?
I thought we had two scheduler threads? If not we should just do that and then its not a problem. |
{ | ||
int inbounds = 0; | ||
LOCK(cs_vNodes); | ||
for (const CNode* node : vNodes) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we count that such connections are from non-local addresses? Should be an extra two LoC
🐙 This pull request conflicts with the target branch and needs rebase. Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft". |
LookupHost(host, ips, 1, true); | ||
} | ||
} | ||
LogPrintf("DNS requests were made to update caches.\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be logged to a debug category to avoid reducing the signal to noise in the default bitcoind output? We already print too much along the lines of "everything is working as expected" which makes it easy to miss the more relevant "oh, something seems broken" log messages :)
threadDNSAddressSeed = std::thread(&TraceThread<std::function<void()> >, "dnsseed", std::function<void()>(std::bind(&CConnman::ThreadDNSAddressSeed, this))); | ||
if (fListen && fNameLookup && !HaveNameProxy()) { | ||
LogPrintf("Proactive querying DNS servers to update caches is enabled\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be logged to a debug category to avoid reducing the signal to noise in the default bitcoind output?
Not sure what to do with this PR. It kind of went nowhere, so I'm gonna close this PR for now. If you have an opinion on the way to move forward with one of those paths, please go to the mentioned discussion. If you have an alternative way to move forward, feel free to comment here and we can resurrect it. |
Every time a fresh Bitcoin Core node starts, it makes a DNS query to learn about nodes in the network. This process leaks the privacy of those new nodes: every required DNS server and the corresponding infrastructure would be aware that a new node was spinned up in a particular internet segment.
The goal of this proposal is to reduce the number of those actors learning about a new node.
The way to achieve it is to keep DNS server caches updated, so that new nodes rarely hit anything past the early servers.
To keep them updated, every reachable node would periodically make all widely used DNS queries, thus, triggering DNS cache updates on the resolvers appearing on their path to the end DNS servers.
This, obviously, would leak more information about the existence of these reachable nodes. We think this is no big deal, because those nodes are already easy to find, since they are reachable.
Note that this helps only to those private nodes, which share a subnet with some reachable node running this code.
In future, it would be great to analyze the results of those queries against a local AddrMan to check for anomalies.
The idea was originally proposed by Greg Maxwell.