-
Notifications
You must be signed in to change notification settings - Fork 3.4k
ciliumidentity resiliency improvement #28912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ciliumidentity resiliency improvement #28912
Conversation
Commit f6baedd does not match "(?m)^Signed-off-by:". Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin |
2b58614
to
3761d25
Compare
/test |
1 similar comment
/test |
3761d25
to
2a0392a
Compare
/test |
d5e4786
to
c497e09
Compare
c497e09
to
ce82268
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The overall structure looks good to me. I've left a few comment inline
240ed3e
to
6cab386
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Just a couple more comments inline.
6cab386
to
6f35c61
Compare
/test |
Hitting: #26351 in e2e |
Seeing
on the ginkgo tests, seems like possibly a sporadic issue with the docker registry, retrying... |
6f35c61
to
1a04e3f
Compare
/test |
rebasing back one commit on main to see if that resolves/narrows down the gw conformance failures. |
/test |
1 similar comment
/test |
/test |
OnDelete k8s handler will now use a controller to reattempt master key recreation (i.e. situations where the owning agent sees a delete event on the k8s cid crd). This will use the localKeys store as a source of truth of local ciliumidentities, with sync attempts only happening if the numeric identity exists in localKeys. Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
onDeleteLocked is used when draining the cache to remove all keys. In this case we want to remove all keys regardless of master keys. This switches this functionality to make the master key recreation optional, and only using this behavior when handling OnDelete events. Other uses of onDeleteLocked should always remove the identity locally. Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
This test should cover: * OnDelete with master key protection enabled causing resilient sync of cid crd. * OnDelete without master key protection. Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
These tests have been commented out for over four years, so we should probably just get rid of them. Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
1a04e3f
to
452c640
Compare
/test |
@tommyp1ckles can we also have a release note for this one? Thanks! |
yup added |
Currently, the CiliumIdentity K8s watcher OnDelete handler has a mechanism to recreate wrongly deleted CiliumIdentities.
This is done by checking if:
enableMasterKeyProtection
is enabled (default=true).If so, K8s Delete events are handled by recreating the local CID object. However this is not done resiliently - where if there is a failure at that time with the create request to apiserver then the error is logged and the issue never resolved.
To improve this, this will perform the request attempt in a controller, terminating only if:
This will provide resiliency against issues where the apiserver is temporarily unavailable, the agent has been rate limited, etc.
Testing
Kind: ensuring ciliumidentities are recreated by deleting the cid and looking for relevant logs: