allocator: Fix kvstore identity leak #34893
Conversation
/test
This pull request has been automatically marked as stale because it …
Ahh. I'll rebase this and fix the conflict as well as remove the extra code in #34893 (comment) tomorrow morning.
Force-pushed from 9d17929 to 7a2c1f5
Got stuck with other work, but did the rebase now. Will read through on Monday and then remove the …
Force-pushed from 7a2c1f5 to b1dd783
After running the reproduction script in #35451 with this code, with a …, the following is the output while the script is creating and deleting pods:

$ kubectl exec -n kube-system daemonset/cilium -- cilium kvstore get --recursive "cilium/state/identities/v1/value/" 2>&1 |egrep "fix-"
cilium/state/identities/v1/value/k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=default;k8s:io.cilium.k8s.policy.cluster=default;k8s:io.cilium.k8s.policy.serviceaccount=default;k8s:io.kubernetes.pod.namespace=default;k8s:run=fix-745;/172.18.0.2 => 1377

After stopping the script, once all pods are deleted:

$ kubectl exec -n kube-system daemonset/cilium -- cilium kvstore get --recursive "cilium/state/identities/v1/value/" 2>&1 |egrep "fix-"
<no output>
Thanks for discovering this issue and proposing a fix. Your patch overall looks great to me -- I've just left a few minor comments inline.
Force-pushed from b1dd783 to 9b1782b
Force-pushed from 9b1782b to e5dbba8
Thanks for looking at this @giorio94! Pushed a new version now. Been running it in a loop locally and it seems to work as expected. Running locally I see some of these (with a tiny kvstore-sync value):
…
so it all seems to work as intended; and after all pods are gone, all the "slave keys" are released correctly and we have no leaks.
Force-pushed from e5dbba8 to 16a8184
Thanks, looks great to me 🚀 Just a couple of nits inline.
/test
The syncLocalKeys call will, with some small probability, re-create slave keys for identities that are no longer used on the node. This often happens immediately after the last reference on the node is cleaned up and the slave key is deleted during the Release portion of the allocator. This results in the slave key being present until the etcd lease expires, and that usually doesn't happen until the agent restarts. This is not an issue for the master keys, as they do not use a per-node lease and are cleaned up by the operator at an interval.

In large clusters with a lot of identity and pod churn, we see this happening multiple times an hour. It can be investigated by looking at the "Re-created missing slave key" logline. This often leads to several thousand extra identities in the clusters that are unused and not cleaned up.

This function will now ensure, as best it can, that it does not upsert the key if it is no longer in use. And if the key is unused after the upsert is done, it grabs the lock to double-check; if it is no longer in use, it releases it to ensure it is cleaned up.

Signed-off-by: Odin Ugedal <ougedal@palantir.com>
Signed-off-by: Odin Ugedal <odin@uged.al>
Signed-off-by: Odin Ugedal <odin@uged.al>
Signed-off-by: Odin Ugedal <ougedal@palantir.com>
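For readers following along, here is a minimal sketch of the guard the commit message describes, using hypothetical stand-in types (`Key`, `allocator`, `kvBackend`) rather than the real Cilium allocator API:

```go
package allocatorsketch

import (
	"context"
	"sync"
)

// Hypothetical stand-ins for the allocator pieces discussed above; the real
// Cilium types and method names differ.
type Key string

type kvBackend interface {
	UpdateKey(ctx context.Context, id uint64, key Key) error
	Release(ctx context.Context, id uint64, key Key) error
}

type allocator struct {
	slaveKeysMutex sync.Mutex
	localKeys      map[Key]uint64 // key -> ID referenced locally (0/absent = unused)
	kv             kvBackend
}

// syncLocalKey sketches the "check, upsert, re-check under the lock" pattern:
// skip keys that are no longer referenced on this node, and if the last
// reference is dropped while the upsert is in flight, release the key again
// so the slave key does not linger until the etcd lease expires.
func (a *allocator) syncLocalKey(ctx context.Context, key Key) error {
	a.slaveKeysMutex.Lock()
	id := a.localKeys[key]
	a.slaveKeysMutex.Unlock()
	if id == 0 {
		return nil // not in use on this node; don't re-create the slave key
	}

	if err := a.kv.UpdateKey(ctx, id, key); err != nil {
		return err
	}

	// The last local reference may have been released while UpdateKey was in
	// flight; double-check under the lock and undo the upsert if so.
	a.slaveKeysMutex.Lock()
	defer a.slaveKeysMutex.Unlock()
	if a.localKeys[key] == 0 {
		return a.kv.Release(ctx, id, key)
	}
	return nil
}
```

The point of the pattern is that the lock is only held for the two cheap map lookups, never across the kvstore round-trip.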
Force-pushed from 16a8184 to c1ad50a
/test
Thanks!
This fixes an identity leak in the kvstore implementation that has been here since the beginning. If the kvstore-refresh sync dumps the loadKeys immediately before a key is released, it will try to update that key. In large clusters with a lot of identity and pod churn, we see this happening multiple times an hour. It can be investigated by looking at the "Re-created missing slave key" logline. This often leads to several thousand extra identities in the clusters that are unused and not cleaned up.

This could also be easily mitigated by always grabbing the `slaveKeysMutex`, but that would most likely slow down identity allocation on the agents considerably, so it is probably not a good idea (see the sketch after this description for what holding the lock across the kvstore round-trip would look like).

The terms `slave` and `master` key are also a bit confusing; they are sometimes explained as `ID` and `reference` (e.g. via "acquire reference"), so a consistent and better naming scheme would help the readability of the code. We can probably save that for another PR.

See the commits for a more detailed explanation.
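For contrast, a minimal sketch of the dismissed alternative mentioned above, again with hypothetical stand-in types rather than the real allocator code:

```go
package allocatorsketch

import (
	"context"
	"sync"
)

// Same hypothetical stand-ins as in the earlier sketch; not the real Cilium
// allocator API.
type Key string

type kvBackend interface {
	UpdateKey(ctx context.Context, id uint64, key Key) error
}

type allocator struct {
	slaveKeysMutex sync.Mutex
	localKeys      map[Key]uint64
	kv             kvBackend
}

// syncLocalKeyLocked is the naive mitigation: hold slaveKeysMutex for the
// whole kvstore round-trip. It avoids the re-creation race, but every
// identity allocation/release on the agent now waits behind a network call
// to etcd during each refresh.
func (a *allocator) syncLocalKeyLocked(ctx context.Context, key Key) error {
	a.slaveKeysMutex.Lock()
	defer a.slaveKeysMutex.Unlock()

	id := a.localKeys[key]
	if id == 0 {
		return nil
	}
	return a.kv.UpdateKey(ctx, id, key) // network round-trip under the lock
}
```

Compared to the approach taken in the commits (check before the upsert, re-check under the lock afterwards), this variant keeps the lock held across the etcd call, which is exactly what would slow down identity allocation.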
Fixes: #35451