Conversation

@odinuge (Member) commented Sep 16, 2024

This fixes an identity leak in the kvstore implementation that has existed since the beginning. If the kvstore-refresh sync dumps the loadKeys immediately before a key is released, it will try to update that key. In large clusters with a lot of identity and pod churn, we see this happening multiple times an hour. It can be investigated by looking at the "Re-created missing slave key" logline. This often leads to thousands of extra identities in the cluster that are unused and never cleaned up.

This could also be easily mitigated by always grabbing the slaveKeysMutex, but that would most likely slow down identity allocation on the agents considerably, so it is probably not a good idea.

The terms slave key and master key are also a bit confusing; sometimes they are explained as ID and reference (e.g. via "acquire reference"). A consistent, better naming scheme would improve the readability of the code, but we can probably save that for another PR.

See the commits for a more detailed explanation.


Please ensure your pull request adheres to the following guidelines:

  • For first time contributors, read Submitting a pull request
  • All code is covered by unit and/or runtime tests where feasible.
  • All commits contain a well written commit description including a title,
    description and a Fixes: #XXX line if the commit addresses a particular
    GitHub issue.
  • If your commit description contains a Fixes: <commit-id> tag, then
    please add the commit author[s] as reviewer[s] to this issue.
  • All commits are signed off. See the section Developer’s Certificate of Origin
  • Provide a title or release-note blurb suitable for the release notes.
  • Are you a user of Cilium? Please add yourself to the Users doc
  • Thanks for contributing!

Fixes: #35451

Fix identity leak for kvstore identity mode

@maintainer-s-little-helper (bot) added the label dont-merge/needs-release-note-label (The author needs to describe the release impact of these changes.) Sep 16, 2024
@odinuge (Member Author) commented Sep 16, 2024

/test

This pull request has been automatically marked as stale because it
has not had recent activity. It will be closed if no further activity
occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Oct 17, 2024
@odinuge (Member Author) commented Oct 17, 2024

Ahh. I'll rebase this and fix the conflict as well as remove the extra code in #34893 (comment) tomorrow morning.

@github-actions github-actions bot removed the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Oct 18, 2024
@odinuge (Member Author) commented Oct 18, 2024

Got stuck with other work, but did the rebase now. Will read through on Monday and then remove the draft mark.

@odinuge odinuge marked this pull request as ready for review October 21, 2024 08:47
@odinuge odinuge requested review from a team as code owners October 21, 2024 08:48
@odinuge odinuge requested review from giorio94 and youngnick October 21, 2024 08:48
@odinuge (Member Author) commented Oct 21, 2024

After running the reproduction script in #35451 against this code, using a fix- pod prefix, I get the following.

While the script that creates and deletes pods is running:

$ kubectl exec -n kube-system daemonset/cilium -- cilium kvstore get --recursive "cilium/state/identities/v1/value/" 2>&1 |egrep "fix-"
cilium/state/identities/v1/value/k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=default;k8s:io.cilium.k8s.policy.cluster=default;k8s:io.cilium.k8s.policy.serviceaccount=default;k8s:io.kubernetes.pod.namespace=default;k8s:run=fix-745;/172.18.0.2 => 1377

After stopping the script, once all pods are deleted:

$ kubectl exec -n kube-system daemonset/cilium -- cilium kvstore get --recursive "cilium/state/identities/v1/value/" 2>&1 |egrep "fix-"
<no output>

@giorio94 (Member) left a comment

Thanks for discovering this issue and proposing a fix. Your patch overall looks great to me -- I've just left a few minor comments inline.

@giorio94 giorio94 added kind/bug This is a bug in the Cilium logic. affects/v1.14 This issue affects v1.14 branch affects/v1.15 This issue affects v1.15 branch needs-backport/1.16 This PR / issue needs backporting to the v1.16 branch sig/policy Impacts whether traffic is allowed or denied based on user-defined policies. area/kvstore Impacts the KVStore package interactions. release-note/bug This PR fixes an issue in a previous release of Cilium. labels Oct 25, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Oct 25, 2024
@odinuge (Member Author) commented Nov 1, 2024

Thanks for looking at this @giorio94! Pushed a new version now.

It has been running in a loop locally and seems to work as expected!

Running locally (with a tiny kvstore-sync value) I see some of these:

$ kubectl logs -n kube-system -l k8s-app=cilium  -f |grep "Re-created" -A 2 -B 2
time="2024-11-01T16:25:24.806888127Z" level=info msg="Released last local use of key, invoking global release" key="[k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=default k8s:io.cilium.k8s.policy.cluster=default k8s:io.cilium.k8s.policy.serviceaccount=default k8s:io.kubernetes.pod.namespace=default k8s:run=fix-101]" subsys=kvstorebackend
time="2024-11-01T16:25:24.925191043Z" level=info msg="Removed endpoint" ciliumEndpointName=default/fix-101 containerID=d42039ca5d containerInterface= datapathPolicyRevision=3 desiredPolicyRevision=3 endpointID=127 identity=17055 ipv4=10.244.1.118 ipv6= k8sPodName=default/fix-101 subsys=endpoint
time="2024-11-01T16:25:25.17641821Z" level=warning msg="Re-created missing slave key" key="cilium/state/identities/v1/value/k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=default;k8s:io.cilium.k8s.policy.cluster=default;k8s:io.cilium.k8s.policy.serviceaccount=default;k8s:io.kubernetes.pod.namespace=default;k8s:run=fix-101;/172.18.0.3" subsys=kvstorebackend
time="2024-11-01T16:25:25.176465627Z" level=warning msg="Releasing now unused key that was re-recreated" id=17055 key="[k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=default k8s:io.cilium.k8s.policy.cluster=default k8s:io.cilium.k8s.policy.serviceaccount=default k8s:io.kubernetes.pod.namespace=default k8s:run=fix-101]" subsys=allocator
time="2024-11-01T16:25:25.17650171Z" level=info msg="Released last local use of key, invoking global release" key="[k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=default k8s:io.cilium.k8s.policy.cluster=default k8s:io.cilium.k8s.policy.serviceaccount=default k8s:io.kubernetes.pod.namespace=default k8s:run=fix-101]" subsys=kvstorebackend

So it all seems to work as intended: after all pods are gone, all the "slave keys" are released correctly and we have no leaks.

@giorio94 (Member) left a comment

Thanks, looks great to me 🚀 Just a couple of nits inline.

@giorio94 (Member) commented Nov 4, 2024

/test

With some small probability, the syncLocalKeys call will re-create slave keys for identities that are no longer used on the node. This often happens immediately after the last reference on the node is cleaned up and the slave key is deleted during the Release portion of the allocator. The result is that the slave key remains present until the etcd lease expires, which usually does not happen until the agent restarts.

This is not an issue for the master keys, as they do not use a per-node lease and are cleaned up by the operator at an interval.

In large clusters with a lot of identity and pod churn, we see this happening multiple times an hour. It can be investigated by looking at the "Re-created missing slave key" logline. This often leads to thousands of extra identities in the cluster that are unused and never cleaned up.

The function now ensures, as best it can, that it will not upsert a key that is no longer in use. And if the key is unused after the upsert is done, it grabs the lock to double-check, releasing the key again if it is no longer in use, to ensure it is cleaned up.

Signed-off-by: Odin Ugedal <ougedal@palantir.com>
Signed-off-by: Odin Ugedal <odin@uged.al>
@odinuge (Member Author) commented Nov 4, 2024

/test

@giorio94 (Member) left a comment

Thanks!

@giorio94 giorio94 requested a review from youngnick November 11, 2024 08:39
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Nov 18, 2024
@aanm aanm enabled auto-merge November 18, 2024 10:56
@aanm aanm added this pull request to the merge queue Nov 18, 2024
Merged via the queue into cilium:main with commit 844c1b3 Nov 18, 2024
66 checks passed
@rastislavs rastislavs mentioned this pull request Nov 20, 2024
14 tasks
@rastislavs rastislavs added backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. and removed needs-backport/1.16 This PR / issue needs backporting to the v1.16 branch labels Nov 20, 2024
@github-actions github-actions bot added backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. and removed backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. labels Nov 26, 2024
Merging this pull request closed the linked issue: kvstore implementation leaks cilium identities during identity churn (#35451).