Skip to content

Conversation

naumenkogs
Copy link
Member

@naumenkogs naumenkogs commented Mar 24, 2020

Every time a fresh Bitcoin Core node starts, it makes a DNS query to learn about nodes in the network. This process leaks the privacy of those new nodes: every required DNS server and the corresponding infrastructure would be aware that a new node was spinned up in a particular internet segment.
The goal of this proposal is to reduce the number of those actors learning about a new node.
The way to achieve it is to keep DNS server caches updated, so that new nodes rarely hit anything past the early servers.

To keep them updated, every reachable node would periodically make all widely used DNS queries, thus, triggering DNS cache updates on the resolvers appearing on their path to the end DNS servers.
This, obviously, would leak more information about the existence of these reachable nodes. We think this is no big deal, because those nodes are already easy to find, since they are reachable.

Note that this helps only to those private nodes, which share a subnet with some reachable node running this code.

In future, it would be great to analyze the results of those queries against a local AddrMan to check for anomalies.

The idea was originally proposed by Greg Maxwell.

@naumenkogs naumenkogs changed the title Periodically update DNS caches for better privacy of non reachable-nodes Periodically update DNS caches for better privacy of non-reachable nodes Mar 24, 2020
@laanwj laanwj added the P2P label Mar 24, 2020
@DrahtBot
Copy link
Contributor

DrahtBot commented Mar 24, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@sipa
Copy link
Member

sipa commented Mar 24, 2020

Can this use the scheduler instead of using a separate thread?

@practicalswift
Copy link
Contributor

Concept ACK

@naumenkogs naumenkogs force-pushed the 2020_03_dns_cache_update branch from be78cdc to 5e87cba Compare March 24, 2020 16:42
@naumenkogs
Copy link
Member Author

@sipa yes! I simply forgot that this can be done with scheduler. Just force-pushed an update.

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some style-nits

@maflcko
Copy link
Member

maflcko commented Mar 24, 2020

Can this use the scheduler instead of using a separate thread?

What is the maximum time this can take? If it is unbounded, it seems that this would be a trivial DOS attack against a node, no?

@naumenkogs naumenkogs force-pushed the 2020_03_dns_cache_update branch 2 times, most recently from 0527577 to 5299c3b Compare March 24, 2020 18:41
@naumenkogs
Copy link
Member Author

@sipa I was not aware that scheduled event will block other execution threads, and now that Marco pointed out that attack, i'm thinking separate thread makes more sense? We don't want to make any assumptions about these requests executing very fast.

@maflcko
Copy link
Member

maflcko commented Mar 24, 2020

LimitValidationInterfaceQueue is called in ActivateBestChain. So even validation itself can be blocked, if an attacker can make those requests go slow. Not sure if that is possible, but I wanted to raise the point.

@sipa
Copy link
Member

sipa commented Mar 26, 2020

@naumenkogs It won't block other threads, but it will delay other events in the same scheduler. I agree that it's probably too slow for that (I don't know what latency getaddrinfo can have, but I expect it may be substantial). An alternative is doing it from PeerValidationLogic::SendMessages, once a timer expires. I'm hammering on avoiding a separate thread for this, because threads have a significant memory footprint (several MB, at least), so a thread that's idle almost all the time is a waste.

@MarcoFalke Vaguely related, do we have any policies around what maximum execution duration things in the main scheduler are allowed to have?

@naumenkogs
Copy link
Member Author

naumenkogs commented Mar 26, 2020

Agreed that that RAM overhead of a new thread for just this is no good.

Yeah, I would say normal execution of this function takes 2s, abnormal may take...a minute even if everyone is honest, due to random internet/os glitches.
Do we have anything in our toolset to make sure Lookup() function doesn't take more than couple seconds? I don't see anything within scheduler. Maybe it's doable to add this feature?

An alternative crazy idea is to have a special thread for a number of rare not-so-critical functions, which might take long time (for example, I can imagine DumpAddresses takes long due to I/O issues).

Reachable nodes would query DNS servers every hour
for all widely used combinations of service bits.
This would give non-reachable nodes from the same subnets
more privacy, because they will be hitting early DNS caches.
@naumenkogs naumenkogs force-pushed the 2020_03_dns_cache_update branch from 5299c3b to de74888 Compare April 16, 2020 14:40
// the only way to check if it is reachable is to observe inbound connections.
// This check is made every time, because a node may become non-reachable
// due to external factors (e.g. local network settings update).
if (CountInboundConnections() == 0) return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about a -dnscloaking option to force it on/off?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Will add this config and add the documentation once we get more concept acks.

// updated to invalide records for that service bit.
std::vector<uint64_t> SERVICE_BITS_COMBINATIONS{
// Legacy nodes
NODE_NETWORK,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't these be requesting the bare name without any service bit specification?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be x1.dns.whatever.com?

std::vector<uint64_t> SERVICE_BITS_COMBINATIONS{
// Legacy nodes
NODE_NETWORK,
NODE_NETWORK_LIMITED,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What nodes would ever request this?


// query for all widely used combinations of service bits
for (uint64_t service_bits_combination : SERVICE_BITS_COMBINATIONS) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these really be done all at the same time? No new node would batch them like this...?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, one private node in the subnet will request one combination, another private node will request another one...

Why would you want the already public node separate cache updates in time?

threadDNSAddressSeed = std::thread(&TraceThread<std::function<void()> >, "dnsseed", std::function<void()>(std::bind(&CConnman::ThreadDNSAddressSeed, this)));
if (fListen && fNameLookup && !HaveNameProxy()) {
LogPrintf("Proactive querying DNS servers to update caches is enabled\n");
scheduler.scheduleEvery([this] { DNSCachesUpdate(); }, DNS_CACHES_UPDATE_INTERVAL);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably there should be some randomness added to the interval for each iteration?

@TheBlueMatt
Copy link
Contributor

I thought we had two scheduler threads? If not we should just do that and then its not a problem.

{
int inbounds = 0;
LOCK(cs_vNodes);
for (const CNode* node : vNodes) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we count that such connections are from non-local addresses? Should be an extra two LoC

@DrahtBot
Copy link
Contributor

🐙 This pull request conflicts with the target branch and needs rebase.

Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft".

LookupHost(host, ips, 1, true);
}
}
LogPrintf("DNS requests were made to update caches.\n");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be logged to a debug category to avoid reducing the signal to noise in the default bitcoind output? We already print too much along the lines of "everything is working as expected" which makes it easy to miss the more relevant "oh, something seems broken" log messages :)

threadDNSAddressSeed = std::thread(&TraceThread<std::function<void()> >, "dnsseed", std::function<void()>(std::bind(&CConnman::ThreadDNSAddressSeed, this)));
if (fListen && fNameLookup && !HaveNameProxy()) {
LogPrintf("Proactive querying DNS servers to update caches is enabled\n");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be logged to a debug category to avoid reducing the signal to noise in the default bitcoind output?

@naumenkogs
Copy link
Member Author

naumenkogs commented Sep 29, 2021

Not sure what to do with this PR.
The show-stopper here was the discussion around the blocking behavior of DNS queries: the options were either add a new thread or employing existing threads for such non-critical task (see discussion).

It kind of went nowhere, so I'm gonna close this PR for now. If you have an opinion on the way to move forward with one of those paths, please go to the mentioned discussion.

If you have an alternative way to move forward, feel free to comment here and we can resurrect it.

@naumenkogs naumenkogs closed this Sep 29, 2021
@bitcoin bitcoin locked and limited conversation to collaborators Oct 30, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants