-
Notifications
You must be signed in to change notification settings - Fork 3.4k
v1.10 backports 2022-06-07 #20100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v1.10 backports 2022-06-07 #20100
Conversation
[ upstream commit 507332a ] Upon a successful DNS response, Cilium's DNS proxy code will sync the DNS history state to the individual endpoint's header file. Previously, this sync was done inside a trigger, however the calling code, (*Endpoint).SyncEndpointHeaderFile(), acquired a write-lock for no good reason. This effectively negated the benefits of having the DNS history sync behind a trigger of 5 seconds. This is especially suboptimal because the header file sync is actually causing Cilium to serialize processing the DNS request for a single endpoint. To illustrate the impact of the above a bit more concretely, if a single endpoint does 10 DNS requests at the same time, acquiring the write-lock causes the processing of those 10 requests to be done one at a time. For the sake of posterity, this is not the case if 10 endpoints were to make DNS requests in parallel. This obviously has a performance impact both in terms of being slow CPU-wise, but also memory-wise. Take for example a DNS request bursty environment, it could cause an uptick in memory usage due to many goroutines being created and blocking due to the serialized nature of locking. Now that the code is all executing behind a trigger, we can remove the lock completely and initialize the trigger setup where the Endpoint object is created (e.g. createEndpoint(), parseEndpoint()). Now the lock is only taken in every 5 seconds when the trigger runs. This should relieve the lock contention drastrically. For context, in a user's environment where the pprof was shared with us, there were around 440 goroutines with 203 of them stuck waiting inside SyncEndpointHeaderFile(). We can also modify SyncEndpointHeaderFile() to no longer return an error, because it's not possible for invoking the trigger to fail. If we fail to initialize the trigger itself, then we log an error, but this is essentially impossible because it can only fail if the trigger func is nil (which we control). Understanding the locking contention came from inspecting the pprof via the following command and subsequent code inspection. ``` go tool trace -http :8080 ./cilium ./pprof-trace ``` Suggested-by: Michi Mutsuzaki <michi@isovalent.com> Suggested-by: André Martins <andre@cilium.io> Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 4c2c244 ] Use strings.Builder instead of fmt.Sprintf() and preallocate the size of the string so that Go doesn't need to over-allocate if the string ends up longer than what the buffer growth algorithm predicts. Results: ``` $ go test -v -run '^$' -bench 'BenchmarkFQDNSelectorString' -benchmem ./pkg/policy/api > old.txt $ go test -v -run '^$' -bench 'BenchmarkFQDNSelectorString' -benchmem ./pkg/policy/api > new.txt $ benchcmp old.txt new.txt benchmark old ns/op new ns/op delta BenchmarkFQDNSelectorString-8 690 180 -73.97% benchmark old allocs new allocs delta BenchmarkFQDNSelectorString-8 9 4 -55.56% benchmark old bytes new bytes delta BenchmarkFQDNSelectorString-8 288 208 -27.78% ``` Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 351c5d8 ] SortedList() and FormatForKVStore() can be very hot code in environments where there's constant policy churn, especially CIDR policies where there can be a large number of CIDR labels. This commit adds benchmarks for later commits to use as a baseline. Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 0790e07 ] FormatForKVStore() previously returned a string for no reason as every caller converted the return value to a byte slice. This allows us to eliminate string concatenation entirely and use the bytes.Buffer directly. Building on the above, given that SortedList() returns a byte slice and calls FormatForKVStore() for its output, we can optimize it with the same technique to eliminate string concatenation. Here are the benchmark comparisons: ``` $ go test -v -run '^$' -bench 'BenchmarkLabels_SortedList|BenchmarkLabel_FormatForKVStore' -benchmem ./pkg/labels > old.txt $ go test -v -run '^$' -bench 'BenchmarkLabels_SortedList|BenchmarkLabel_FormatForKVStore' -benchmem ./pkg/labels > new.txt $ benchcmp old.txt new.txt benchcmp is deprecated in favor of benchstat: https://pkg.go.dev/golang.org/x/perf/cmd/benchstat benchmark old ns/op new ns/op delta BenchmarkLabels_SortedList-8 2612 1120 -57.12% BenchmarkLabel_FormatForKVStore-8 262 54.5 -79.18% benchmark old allocs new allocs delta BenchmarkLabels_SortedList-8 35 13 -62.86% BenchmarkLabel_FormatForKVStore-8 4 1 -75.00% benchmark old bytes new bytes delta BenchmarkLabels_SortedList-8 1112 664 -40.29% BenchmarkLabel_FormatForKVStore-8 96 48 -50.00% ``` Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 5fa7ae2 ] Advanced users can configure the LRU size for the cache holding the compiled regex expressions of FQDN match{Pattern,Name}. This is useful if users are experiencing high memory usage spikes with many FQDN policies that have repeated matchPattern or matchName across many different policies. Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 38c0036 ] When comparing efficieny of increasing the LRU size from 128 to 1024 with ~22k CNPs, we see the following results: ``` \# LRU size 128. $ go test -tags privileged_tests -v -run '^$' -bench Benchmark_perEPAllow_setPortRulesForID_large -benchmem -benchtime 1x -memprofile memprofile.out ./pkg/fqdn/dnsproxy > old.txt \# LRU size 1024. $ go test -tags privileged_tests -v -run '^$' -bench Benchmark_perEPAllow_setPortRulesForID_large -benchmem -benchtime 1x -memprofile memprofile.out ./pkg/fqdn/dnsproxy > new.txt $ benchcmp old.txt new.txt benchcmp is deprecated in favor of benchstat: https://pkg.go.dev/golang.org/x/perf/cmd/benchstat benchmark old ns/op new ns/op delta Benchmark_perEPAllow_setPortRulesForID_large-8 3954101340 3010934555 -23.85% benchmark old allocs new allocs delta Benchmark_perEPAllow_setPortRulesForID_large-8 26480632 24167742 -8.73% benchmark old bytes new bytes delta Benchmark_perEPAllow_setPortRulesForID_large-8 2899811832 1824062992 -37.10% ``` Here's the raw test run with LRU size at 128: ``` Before (N=1) Alloc = 31 MiB HeapInuse = 45 MiB Sys = 1260 MiB NumGC = 15 After (N=1) Alloc = 445 MiB HeapInuse = 459 MiB Sys = 1260 MiB NumGC = 40 ``` Here's the raw test run with LRU size at 1024: ``` Before (N=1) Alloc = 31 MiB HeapInuse = 48 MiB Sys = 1177 MiB NumGC = 17 After (N=1) Alloc = 78 MiB HeapInuse = 93 MiB Sys = 1177 MiB NumGC = 53 ``` We can see that it's saving ~300MB. Furthermore, if we compare the memprofiles from the benchmark run via ``` go tool pprof -http :8080 -diff_base memprofile.out memprofile.1024.out ``` we see an ~800MB reduction in the regex compilation. Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit ce9583d ] Following the previous commit's benchmark result, let's update the LRU default size to be 1024, given that it only results in a few 10's of MBs increase when the cache nears full. Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 95362de ] This commit adds a new convenience functions to get a count of * how many entries are inside the DNS cache (IPs) * and how many FQDNs are inside the DNS cache It will be used by upcoming commits to expose these values as a metrics. Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit d9ae1c4 ] This commit contains no functional changes and is only cosmetic to ease future commits when adding new FQDN-related metrics. Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 47870fe ] This commit exposes new metrics that show the number of active names and IPs in the DNS cache, and number of alive FQDN connections that have expired (aka zombies) per endpoint. This is useful to track the endpoint's DNS cache and DNS zombie cache sizes over time. Note that these metrics are only updated during the FQDN GC which is currently invoked every minute. Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 044681a ] This commit is to bump prometheus client library to the latest (e.g. v1.12.2) which will have a better support for go collector of different go versions, and fix NaN value. Fixes: cilium#19985 Signed-off-by: Tam Mach <tam.mach@cilium.io> Signed-off-by: Jussi Maki <jussi@isovalent.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small comment on GHA, the rest looks good for my changes.
@@ -121,6 +121,9 @@ replace ( | |||
|
|||
go.universe.tf/metallb => github.com/cilium/metallb v0.1.1-0.20210607221240-b4c60b959dd7 | |||
|
|||
// Make sure to use grpc v1.29.1 as grpc/naming package was removed in newer versions. | |||
google.golang.org/grpc => google.golang.org/grpc v1.29.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was added due to:
go: finding module for package google.golang.org/grpc/naming
github.com/cilium/cilium/pkg/kvstore imports
go.etcd.io/etcd/clientv3 tested by
go.etcd.io/etcd/clientv3.test imports
go.etcd.io/etcd/integration imports
go.etcd.io/etcd/proxy/grpcproxy imports
google.golang.org/grpc/naming: module google.golang.org/grpc@latest found (v1.47.0), but does not contain package google.golang.org/grpc/naming
Perhaps better would be to bump etcd here as well? Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably fine to keep the grpc lib pinned here. This is a v1.10 branch which we won't typically expect changes to vendor moving forward.
[ upstream commit 7c86fe7 ] When the service port for the peer service is not specified, it is automatically assigned port 80 or port 443 (when TLS is enabled). Using these ports make it easy to understand whether TLS is enabled for the service or not. Moreover, it makes the behavior consistent with the Hubble Relay service. Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net> Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit f439177 ] This commit fixes a bug where the keys of the forward map inside the DNS cache were never removed, causing the map to grow forever. By contrast, the reverse map keys were being deleted. For both the forward and reverse maps (which are both maps whose values are another map), the inner map keys were being deleted. In other words, the delete on the outer map key was missing for the forward map. In addition to fixing the bug, this commit expands the unit test coverage to assert after any deletes (entries expiring or GC) that the forward and reverse maps contain what we expect. Particularly, in an environment where there are many unique DNS lookups (unique FQDNs) being done, this forward map could grow quite large over time, especially for a long-lived workload (endpoint). This fixes this memory-leak-like bug. Fixes: cf387ce ("fqdn: Introduce TTL-aware cache for DNS retention") Fixes: f6ce522 ("FQDN: Added garbage collector functions.") Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 3500290 ] This commit is to add ownerReferences for CiliumNodes created by CiliumExternalWorkload, so that we don't unintentionally GC invalid CN. Thanks to @nathanejohnson for reporting this issue. Fixes: cilium#19907 Signed-off-by: Tam Mach <tam.mach@cilium.io> Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit f9863ec ] Signed-off-by: Vlad Ungureanu <ungureanuvladvictor@gmail.com> Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit aa7572b ] Signed-off-by: Vlad Ungureanu <ungureanuvladvictor@gmail.com> Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit c6af580 ] This commit adds the `-o json` output to `cilium node list` and `cilium-health status`, as the text version of both does not contain all details. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com>
82f8351
to
3b1ebc8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks and looks good for my changes 🎖️
/test-backport-1.10 Job 'Cilium-PR-K8s-GKE' failed: Click to show.Test Name
Failure Output
If it is a flake and a GitHub issue doesn't already exist to track it, comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My commit looks good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
github-sec
review is not required anymore following drop of commit editing the workflow files, approving just to get rid of the review request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, my commits look good!
PRs skipped due conflicts:
Once this PR is merged, you can update the PR labels via:
or with