Skip to content

Conversation

christarazi
Copy link
Member

@christarazi christarazi commented May 24, 2022

This commit fixes a bug where the keys of the forward map inside the DNS
cache were never removed, causing the map to grow forever. By contrast,
the reverse map keys were being deleted.

For both the forward and reverse maps (which are both maps whose values
are another map), the inner map keys were being deleted.

In other words, the delete on the outer map key was missing for the
forward map.

In addition to fixing the bug, this commit expands the unit test
coverage to assert after any deletes (entries expiring or GC) that the
forward and reverse maps contain what we expect.

Particularly, in an environment where there are many unique DNS lookups
(unique FQDNs) being done, this forward map could grow quite large over
time, especially for a long-lived workload (endpoint). This fixes this
memory-leak-like bug.

Fixes: cf387ce ("fqdn: Introduce TTL-aware cache for DNS retention")
Fixes: f6ce522 ("FQDN: Added garbage collector functions.")

Signed-off-by: Chris Tarazi chris@isovalent.com

Fix memory leak in the DNS cache when a long-lived endpoint makes many unique DNS lookups over time

@christarazi christarazi added area/proxy Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies. kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium. labels May 24, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot added dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. and removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels May 24, 2022
@christarazi christarazi added area/daemon Impacts operation of the Cilium daemon. needs-backport/1.9 labels May 24, 2022
@christarazi
Copy link
Member Author

/test

@christarazi christarazi marked this pull request as ready for review May 24, 2022 01:18
@christarazi christarazi requested review from a team, tommyp1ckles and jrajahalme and removed request for a team May 24, 2022 01:18
@christarazi
Copy link
Member Author

christarazi commented May 24, 2022

/mlh new-flake Cilium-PR-Runtime-net-next

👍 created #19950

Edit: CI passed except for known flake above. Pushing to address #19925 (comment).

@christarazi christarazi force-pushed the pr/christarazi/fix-fqdn-memory-leak-map branch 2 times, most recently from 2eaa819 to e6546ac Compare May 24, 2022 17:46
This commit fixes a bug where the keys of the forward map inside the DNS
cache were never removed, causing the map to grow forever. By contrast,
the reverse map keys were being deleted.

For both the forward and reverse maps (which are both maps whose values
are another map), the inner map keys were being deleted.

In other words, the delete on the outer map key was missing for the
forward map.

In addition to fixing the bug, this commit expands the unit test
coverage to assert after any deletes (entries expiring or GC) that the
forward and reverse maps contain what we expect.

Particularly, in an environment where there are many unique DNS lookups
(unique FQDNs) being done, this forward map could grow quite large over
time, especially for a long-lived workload (endpoint). This fixes this
memory-leak-like bug.

Fixes: cf387ce ("fqdn: Introduce TTL-aware cache for DNS retention")
Fixes: f6ce522 ("FQDN: Added garbage collector functions.")

Signed-off-by: Chris Tarazi <chris@isovalent.com>
@christarazi christarazi force-pushed the pr/christarazi/fix-fqdn-memory-leak-map branch from e6546ac to 04317b0 Compare May 24, 2022 17:48
@christarazi
Copy link
Member Author

/test

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jun 2, 2022
@joamaki joamaki merged commit f439177 into cilium:master Jun 3, 2022
@christarazi christarazi deleted the pr/christarazi/fix-fqdn-memory-leak-map branch June 3, 2022 16:12
@joestringer joestringer added backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. and removed backport-pending/1.11 labels Jun 9, 2022
@qmonnet qmonnet mentioned this pull request Jun 30, 2022
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/daemon Impacts operation of the Cilium daemon. area/proxy Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers. backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants