DNSCache: TTL cleanup implementation. #6740
Conversation
daemon/fqdn.go
Outdated
@@ -100,6 +101,17 @@ func (d *Daemon) bootstrapFQDN(restoredEndpoints *endpointRestoreState) (err err
	fqdn.StartDNSPoller(d.dnsPoller)
}

// Controller to cleanup TTL expired entries from the DNS policies.
controller.NewManager().UpdateController("dns-ttl-cleanup-job", controller.ControllerParams{
	RunInterval: time.Second,
I'm probably the wrong person to review this PR but do we need to run this controller every second? Can't this be event based on the TTL of the entry?
No, the cleanup needs to happen each second to clean the table.
But the entries don't have a TTL for every second
Different DNS queries have different DNS TTLs, so each second we need to clean the entries whose TTL has expired by that point.
The challenge is knowing when to wake up to expire a specific IP and regenerate policy. We know the expiration time of each entry, and setting a timer for the next one to expire is straightforward. Figuring out the expiry time for the one after the next is annoying. We would need something like a priority queue (with O(log n) best-case update). In theory, most updates will add a new time to the end of this queue (i.e. a new latest time). Still, we can have situations where new entries will expire somewhere in the middle.
Binning the entries into seconds seems like the lowest overhead, since we have as many map entries as we have IPs (or less). The lookup is also constant time for a specific second.
I doubt that we need 1 second accuracy for the TTL. Isn't it enough to expire every 15 seconds?
One thing to change, otherwise LGTM!
pkg/fqdn/cache.go
Outdated
// CleanupExpiredEntries cleans all the expired entries from the lastTime that
// runs to the given time. It will lock the struct until it retrieves all the data.
// It returns the list of names that need to be deleted from the policies.
func (c *DNSCache) CleanupExpiredEntries(expires time.Time) []string {
Let's return lastCleanup as well, this way it will be clear what the time range was for logging etc.
pkg/fqdn/cache.go
Outdated
	expiredEntries = append(expiredEntries, entries...)
	delete(c.cleanup, key)
}
return expiredEntries
Since we now do the actual removal of cacheEntry objects here, we need to do a:
for _, name := range expiredEntries {
	if entries, exists := c.forward[name]; exists {
		entries.removeExpired(expiredTime)
	}
}
We also might want to sort and unique the names. If we do both, the loop above already does the uniquing because of the exists check (if we add to a list only when exists == true).
pkg/fqdn/cache.go
Outdated
lastCleanup time.Time

// cleanup is the map of TTL expiration time and the entries that are going
// to expire at that second.
mention that the expiration time is in seconds, and that it is expected that we will delete entries in CleanupExpiredEntries
pkg/fqdn/cache.go
Outdated
// When lookupTime is much earlier than time.Now(), we may not expire all
// entries that should be expired, leaving more work for .Lookup.
c.removeExpired(entries, time.Now())
c.addNameToCleanup(entry)
This line needs to be in updateWithEntryIPs. This is because we're tracking by IP, but store the whole DNS lookup. The entry may not actually go into the cache if, for a particular IP, there is another lookup that expires later.
In the cases where, for an IP, we are replacing an existing entry, we might need to also replace the cleanup entry. This is because that IP won't actually expire at the original time anymore. It is mostly harmless, though.
daemon/fqdn.go
Outdated
@@ -100,6 +101,17 @@ func (d *Daemon) bootstrapFQDN(restoredEndpoints *endpointRestoreState) (err err
	fqdn.StartDNSPoller(d.dnsPoller)
}

// Controller to cleanup TTL expired entries from the DNS policies.
controller.NewManager().UpdateController("dns-ttl-cleanup-job", controller.ControllerParams{
You were right: putting this here makes things a little complicated. We now need a controller for the Endpoint.DNSHistory caches (unless some of the changes below around replacing cleanup entries go in; then it might end up with steady memory usage).
Force-pushed from 8ad7ec0 to 5dcc6c1
pkg/fqdn/cache.go
Outdated
		c.removeExpired(entries, c.lastCleanup)
	}
}
return uniqExpires, c.lastCleanup
Oh, I meant the original lastCleanup at the start of this function, not the final one (which is going to be roughly expires).
func (c *DNSCache) addNameToCleanup(entry *cacheEntry) {
	expiration := int64(entry.TTL) + time.Now().Unix()
	expiredEntries, exists := c.cleanup[expiration]
	if !exists {
I just realised that the append means we can avoid this `if`, if you prefer.
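For reference, the behaviour this comment relies on: indexing a missing map key yields the zero value (a nil slice), and `append` on a nil slice allocates, so no existence check is needed. A minimal illustration:

```go
package main

import "fmt"

func main() {
	cleanup := map[int64][]string{}
	// cleanup[42] does not exist yet; indexing returns a nil slice,
	// and append on a nil slice allocates a fresh backing array.
	cleanup[42] = append(cleanup[42], "cilium.io")
	cleanup[42] = append(cleanup[42], "example.com")
	fmt.Println(cleanup[42]) // [cilium.io example.com]
}
```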
pkg/fqdn/cache.go
Outdated
// delete the entry from the policy when it expires.
// Need to be called with the lock
func (c *DNSCache) addNameToCleanup(entry *cacheEntry) {
	expiration := int64(entry.TTL) + time.Now().Unix()
I just realised that this isn't right. We already set the ExpirationTime in the entry and use that everywhere else. We should probably use it here too, so it is consistent. It will also apply the TTL correctly relative to the lookup time of the data, instead of time.Now() (which will be within the same second most of the time, so it would mostly work this way).
pkg/fqdn/cache.go
Outdated
@@ -268,6 +323,7 @@ func (c *DNSCache) updateWithEntryIPs(entries ipEntries, entry *cacheEntry) {
		c.upsertReverse(ipStr, entry)
	}
}
c.addNameToCleanup(entry)
This needs to be called when we call c.upsertReverse (just above), otherwise we introduce expirations that aren't actually happening in the cache.
Force-pushed from 5dcc6c1 to 6a8c7eb
daemon/fqdn.go
Outdated
		d.dnsRuleGen.ForceGenerateDNS(namesToClean)
		return nil
	},
	StopFunc: func() error { return nil },
This can be omitted
Force-pushed from 6a8c7eb to f8ab6e6
@@ -165,3 +165,17 @@ func sortedIPsAreEqual(a, b []net.IP) bool {
func prepareMatchName(matchName string) string {
	return strings.ToLower(dns.Fqdn(matchName))
}

func KeepUniqueNames(names []string) []string {
exported function KeepUniqueNames should have comment or be unexported
I have a slight preference for only using the slice instead of making a map. In any case, however we do it, we should document the function and explain whether the output slice is the same as the input or a different one.
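A slice-only version of the helper, as preferred above, might look like this (a sketch; the actual KeepUniqueNames in the PR may differ, e.g. in whether it sorts):

```go
package main

import (
	"fmt"
	"sort"
)

// KeepUniqueNames returns a sorted copy of names with duplicates removed.
// It uses no auxiliary map: after sorting, duplicates are adjacent, so a
// single pass with an in-place compaction of the copy suffices.
func KeepUniqueNames(names []string) []string {
	sorted := append([]string(nil), names...) // copy; the input is untouched
	sort.Strings(sorted)
	unique := sorted[:0] // compact in place over the copy
	for i, name := range sorted {
		if i == 0 || name != sorted[i-1] {
			unique = append(unique, name)
		}
	}
	return unique
}

func main() {
	fmt.Println(KeepUniqueNames([]string{"b.com", "a.com", "b.com"})) // [a.com b.com]
}
```

Note the output slice is always a different backing array from the input, which is one of the two behaviours the comment asks to document.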
// NewDNSCache returns an initialized DNSCache and set the max host limit to |
comment on exported function NewDNSCacheWithLimit should be of the form "NewDNSCacheWithLimit ..."
pkg/fqdn/cache.go
Outdated
@@ -26,6 +26,9 @@ import (
	"github.com/cilium/cilium/pkg/lock"
)

// defines the maximum number of IPs to maintain for each FQDN name
comment on exported var FQDNMaxIPperHost should be of the form "FQDNMaxIPperHost ..."
daemon/fqdn.go
Outdated
for i, ep := range endpoints {
	cfg.Cache.UpdateFromCache(ep.DNSHistory, namesToClean, i == 0)
}
log.Infof("FQDN garbage collector work deleted %d name entries", len(namesToClean))
this is probably a debug... I'm not sure how relevant it is without the list of names deleted
Just an info message, to be able to track this down in the future if there's any issue.
shrug. People keep complaining to me about log lines.
Anyway, can you make it a scopedLog with the controller name or whatever we usually put into the controllers? That will make it easier to filter for it.
daemon/fqdn.go
Outdated
}
// A second loop is needed to update the global cache from the
// endpoints cache
for i, ep := range endpoints {
Also mention: The first update from an endpoint will clear the data in cfg.Cache for each of namesToClean. Looping this way is generally safe despite not locking; if a new lookup happens during these updates, the new DNS data will be reinserted from the endpoint.DNSHistory cache that made the request.
pkg/fqdn/cache.go
Outdated
cleanup map[int64][]string

// overLimit is a map with the host names that have reached the IP limits
// defined by the DNSCache
defined by limit, no?
pkg/fqdn/cache.go
Outdated
@@ -154,16 +181,124 @@ func (c *DNSCache) updateWithEntry(entry *cacheEntry) {
	c.forward[entry.Name] = entries
}
c.updateWithEntryIPs(entries, entry)
if len(entries) > FQDNMaxIPperHost {
this now needs to use c.limit
pkg/fqdn/cache.go
Outdated
overLimit map[string]bool

// limit is the maximum number of IPs per host.
limit int
perHostLimit will be clearer when using it later.
pkg/fqdn/cache.go
Outdated
// cleanupOverLimitEntries returns the names that have reached the max number of
// IPs per host.
// Internally the function sorts the entries by the TTL.
We should say sorted by expiration time. TTL might be ambiguous and mean the actual TTL seconds value.
pkg/fqdn/cache.go
Outdated
func (c *DNSCache) UpdateFromCache(update *DNSCache) {
// individually. If any item in the names slice will only update that names.
// If the clean argument is set to true, all entries will be clean.
func (c *DNSCache) UpdateFromCache(update *DNSCache, names []string, clean bool) {
names might be better as namesToUpdate or something.
The comment is a little unclear; maybe:
When names is non-nil, only those names are updated from update; otherwise all DNS names in update are used.
When clean is true, the data in c for each specific name is cleared before the update.
pkg/option/config.go
Outdated
@@ -260,6 +260,10 @@ const (
	// ToFQDNsEmitPollerEvents controls if poller lookups are sent as monitor events
	ToFQDNsEnablePollerEvents = "tofqdns-enable-poller-events"

	// ToFQDNsMaxIPsPerHost defines the maximum number of IPs to maintain
	// for each FQDN name in an endpoint's FQDN cache
	ToFQDNsMaxIPsPerHost = "tofqdns-max-ip-per-hostname"
tofqdns-endpoint-max-ip-per-hostname might be a little clearer, but it is longer.
Force-pushed from 73f5cc3 to d9184e2
test-me-please
pkg/fqdn/cache.go
Outdated
// UpdateFromCache is a utility function that allows updating a DNSCache
// instance with all the internal entries of another. Latest-Expiration still
// applies, thus the merged outcome is consistent with adding the entries
// individually.
func (c *DNSCache) UpdateFromCache(update *DNSCache) {
// When namesToUpdate is non-nil only those names are updated from update, otherwise
I guess this isn't wholly true. "When namesToUpdate has non-zero length" might be more correct (since someone might pass in an empty list thinking it will be a no-op).
@@ -173,8 +306,19 @@ func (c *DNSCache) UpdateFromCache(update *DNSCache) {

	c.Lock()
	defer c.Unlock()

	for _, newEntries := range update.forward {
		if len(namesToUpdate) == 0 {
I wonder if it's better to convert the string slice to a map when len(namesToUpdate) != 0 instead. This way, the default case just uses update.forward without doing extra work. It probably won't matter too much in the end.
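The suggested map conversion could look roughly like this (a sketch with simplified types, not the actual cache code; `updateForward` is a hypothetical helper):

```go
package main

import "fmt"

// updateForward sketches the lookup-set approach: only build a map when
// namesToUpdate is non-empty, so the common "update everything" path does
// no extra work. It returns the names that would be updated.
func updateForward(forward map[string][]string, namesToUpdate []string) []string {
	var toUpdate map[string]bool
	if len(namesToUpdate) > 0 {
		toUpdate = make(map[string]bool, len(namesToUpdate))
		for _, name := range namesToUpdate {
			toUpdate[name] = true
		}
	}

	var updated []string
	for name := range forward {
		// nil toUpdate means "no filter": every name is selected.
		if toUpdate != nil && !toUpdate[name] {
			continue
		}
		updated = append(updated, name)
	}
	return updated
}

func main() {
	forward := map[string][]string{"a.com": nil, "b.com": nil, "c.com": nil}
	fmt.Println(len(updateForward(forward, []string{"a.com", "c.com"}))) // 2
}
```

The per-name check becomes an O(1) map lookup instead of an O(len(namesToUpdate)) slice scan on every iteration.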
pkg/fqdn/cache.go
Outdated
	continue
}
if clean {
	c.removeExpired(newEntries, time.Now(), time.Now())
I realised that this is wrong. c.removeExpired should only work on entries in c, and not from update. The effect here would be that removeReverse inside removeExpired etc. wouldn't work correctly.
There's also another problem with the assumption I made yesterday. If the first endpoint.DNSHistory doesn't have name, then we won't ever clean it. Another problem is that the time.Now() here leaves a bit of a race.
So, I think we are better off not having clean in this function, but calling ForceExpire before we call this function in the controller. This should keep things relatively consistent (at the cost of a regex compile).
}
if clean {
	c.removeExpired(newEntries, time.Now(), time.Now())
}
for _, newEntry := range newEntries {
	c.updateWithEntry(newEntry)
Not related to this PR, but could you add a comment like:
using newEntry from update violates encapsulation, but these objects are immutable and so it should be harmless to do this (but still a bad idea).
Alternatively, we can copy newEntry (the type is shallow-copyable in general, unless we also play with IP slices, and if we do we deserve bugs).
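The shallow copy mentioned here is a one-line dereference in Go (sketch with a simplified stand-in for cacheEntry; as noted above, this is only safe while the fields are values, not shared slices):

```go
package main

import (
	"fmt"
	"time"
)

// cacheEntry stands in for the real fqdn cacheEntry type.
type cacheEntry struct {
	Name           string
	TTL            int
	ExpirationTime time.Time
}

func main() {
	original := &cacheEntry{Name: "cilium.io", TTL: 60}

	copied := *original // shallow copy of all fields
	copied.TTL = 30     // mutating the copy leaves the original intact

	fmt.Println(original.TTL, copied.TTL) // 60 30
}
```

If the type later gains an IP slice field, the two copies would share that slice's backing array, which is exactly the bug the comment warns about.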
}
toClean := []string{}

// For global cache the limit maybe is not used at all.
Without knowing about the various instances of this type that we use, this is a little confusing. If you wanted to make sure it was clear, maybe add this comment where we instantiate fqdn.DefaultDNSCache?
	}
}

}

// removeExpired removes expired (or nil) cacheEntry pointers from entries, an
Since we know the entry we are expiring, we can also remove it from cleanup. This way if someone forgets to run GC we won't leak memory.
// cleanup is the map of TTL expiration time and the entries that are going
// to expire will be deleted by CleanupExpiredEntries.
cleanup map[int64][]string
We should explain how cleanup and overLimit are being used more completely here. The difference between them is a little unclear. It's possible something like expirationTimes is a better name, but it is longer.
For cleanup:
cleanup maps the TTL expiration times (in seconds since the epoch) to DNS names that expire in that second. On every new insertion where the new data is actually inserted into the cache (i.e. it expires later than an existing entry), cleanup will be updated. CleanupExpiredEntries cleans up these entries on demand.
Note: Lookup functions will not return expired entries, and this is used to proactively enforce expirations.
Note: It is important to periodically call CleanupExpiredEntries, otherwise this map will grow forever.
(this last note isn't needed if we remove entries from cleanup during removeExpired)
For overLimit:
overLimit is a set of DNS names that were over the per-host configured limit when they received an update. The excess IPs will be removed when cleanupOverLimitEntries is called, but they will continue to be returned by Lookup until then.
Note: It is important to periodically call GC, otherwise this map will grow forever (it is very bounded, however).
(this last note isn't needed if we remove entries from overLimit when we insert IPs)
pkg/fqdn/cache.go
Outdated
// When lookupTime is much earlier than time.Now(), we may not expire all
// entries that should be expired, leaving more work for .Lookup.
c.removeExpired(entries, time.Now(), time.Time{})
}

// AddNameToCleanup adds the IP with the given TTL to the cleanup map to
// delete the entry from the policy when it expires.
// Need to be called with the lock
with a write lock (I think we make the distinction elsewhere)
pkg/fqdn/cache.go
Outdated
reverse:     make(map[string]nameEntries),
lastCleanup: time.Now(),
cleanup:     map[int64][]string{},
overLimit:   map[string]bool{},
since we set every other field, perhaps we should also set perHostLimit. It is redundant but makes it clear that it was intended. shrug.
Force-pushed from d9184e2 to 1e64506
Force-pushed from d3cd5cd to 76a7b58
test-me-please (rebased with master)
sorry @eloycoto I accidentally pressed close!
Force-pushed from b400cce to 80e6795
test-me-please
Needs a quick refresh:
Force-pushed from 80e6795 to 3e2ddf8
test-me-please
LGTM for CLI
The reason for this commit is to avoid a large list of identities in FQDN policy, and to implement a cleanup process that removes expired IPs and avoids long policy-calculation times. A cleanup map is used in DNSCache to avoid looping over all the DNS cache entries: each second has its own map entry, and a controller cleans the entries whose IPs have expired at that point in time. A controller is used to avoid having a goroutine with a timer.Tick each second, and to keep everything in the same goroutine. To avoid leaks, instead of passing time.Now() we use cache.lastCleanup to get all the entries since the last run; policy regeneration can take more than 40 seconds for some huge policies, and with this method entries for the intervening seconds will not leak in the cleanup map. A callback function was also tried, but it can block the cache during long policy calculations, and then the DNS proxy will not resolve/respond correctly. A channel was also tried, but it needs a buffer, and the Cilium way is to use a controller. Signed-off-by: Eloy Coto <eloy.coto@gmail.com>
Signed-off-by: Eloy Coto <eloy.coto@gmail.com>
On `updateIPsForName`, if the new data has already expired, or the old data has also expired, there can be an issue in regeneration. Fixes cilium#6796. Signed-off-by: Eloy Coto <eloy.coto@gmail.com>
Force-pushed from 3e2ddf8 to 0e90cfd
test-me-please
Fixed conflict
This PR will not work if #6692 is not merged.
Signed-off-by: Eloy Coto eloy.coto@gmail.com