Skip to content

Conversation

aanm
Copy link
Member

@aanm aanm commented Sep 23, 2020

Partial backport of #13220

$ for pr in 13220; do contrib/backporting/set-labels.py $pr done 1.8; done
Delete Cilium Endpoints as soon the Pod stops running to avoid stale state.

@aanm aanm requested a review from a team as a code owner September 23, 2020 10:48
@maintainer-s-little-helper maintainer-s-little-helper bot added the kind/backports This PR provides functionality previously merged into master. label Sep 23, 2020
@aanm aanm changed the title [v1.8] k8s: delete IPs from ipcache for no running Pods [v1.8] k8s: delete CEPs for no running Pods Sep 23, 2020
@aanm aanm force-pushed the pr/v1.8-backport-13220 branch from 7a193df to 70934d3 Compare September 23, 2020 10:56
@aanm aanm added the release-note/bug This PR fixes an issue in a previous release of Cilium. label Sep 23, 2020
}
if err := ciliumClient.CiliumEndpoints(namespace).Delete(ctx, podName, meta_v1.DeleteOptions{}); err != nil {
if !k8serrors.IsNotFound(err) {
scopedLog.WithError(err).Warning("Unable to delete CEP")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is best effort, ie if there's another error like lack of permissions then we will just give up rather than returning an error to the controller to ensure it gets retried?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that it doesn't look like the controller retires on stop functions. 😢 In any case, the operator will perform the GC of CEPs.

[ upstream commit b3adc4d ]

If an endpoint is not able to be restored we should delete the Cilium
Endpoint from Kubernetes to avoid having dangling Cilium Endpoints in
Kubernetes.

Signed-off-by: André Martins <andre@cilium.io>
…ted"

[ upstream commit 0fbd2b4 ]

This reverts commit 8068f1a.

This reverted commit introduces a regression where Cilium Endpoints can
be left around after the Cilium Endpoint was locally deleted. Although
it was a scale optimization for non existing docker images, the security
aspect will overlap the scalability concern initially thought.

Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit f1b61a7 ]

To avoid wasting resources in Cilium and to avoid leftover
CiliumEndpoints from populating the ipcache, we should not watch for
CiliumEndpoints when disable-endpoint-crd is set to true.

Signed-off-by: André Martins <andre@cilium.io>
Doing a return regardless of the error condition would prevent Cilium
Endpoints to be GCed in a single run.

Fixes: 37139c3 ("operator: remove pod list of an entire cluster")
Signed-off-by: André Martins <andre@cilium.io>
@aanm aanm force-pushed the pr/v1.8-backport-13220 branch from 70934d3 to 8e85f87 Compare September 25, 2020 12:10
@aanm
Copy link
Member Author

aanm commented Sep 25, 2020

test-backport-1.8

@qmonnet qmonnet merged commit 4293cbb into v1.8 Sep 26, 2020
@qmonnet qmonnet deleted the pr/v1.8-backport-13220 branch September 26, 2020 01:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/backports This PR provides functionality previously merged into master. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants