Skip to content

Conversation

joestringer
Copy link
Member

@joestringer joestringer commented Aug 9, 2023

Previously the node manager code only ever performed upserts into the
ipcache for remote-node labels on node creation/update events. However,
it never cleaned up these labels upon delete. As a result, if a node was
ever removed, then Cilium would continue to consider traffic towards
that IP as traffic reaching towards a remote node. In extreme cases,
this IP could be reallocated to another host in the network which could
cause connectivity disruption to that peer, particularly if the new peer
is subject to CIDR or Entities based policies.

Commits:

Related: #26958
Fixes: #27382

Passes the first two test cases here:
https://gist.github.com/joestringer/d7f55da5dd2f33be6148dd52a96a4a54

This reverts commit f9fe611.

The lock acquisition ordering in this original solution to the problem
could trigger deadlocks when the ipcache is concurrently updated by
multiple subsystems.

Signed-off-by: Joe Stringer <joe@cilium.io>
@joestringer joestringer added the release-note/bug This PR fixes an issue in a previous release of Cilium. label Aug 9, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot added backport/1.12 kind/backports This PR provides functionality previously merged into master. labels Aug 9, 2023
Copy link
Member

@christarazi christarazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix looks good to me.

Did you consider whether we need to also call RemoveMetadata in the NodeUpdated case which handles an update (old node -> new node) for the old node? The line I'm thinking about is here.

@joestringer joestringer force-pushed the pr/joe/fix-ipcache-node-leak-1-12 branch from 437bcaf to 5550013 Compare August 9, 2023 22:40
@joestringer
Copy link
Member Author

Did you consider whether we need to also call RemoveMetadata in the NodeUpdated case which handles an update (old node -> new node) for the old node? The line I'm thinking about is here.

Hmm, at a glance it looks like we should add it there too.

I have a reproducer via Maartje for the "delete node" case, but I'm not sure I understand how to reassign IPs for existing nodes in order to exercise that path.

@joestringer joestringer force-pushed the pr/joe/fix-ipcache-node-leak-1-12 branch from 5550013 to 8468e03 Compare August 9, 2023 23:15
@christarazi
Copy link
Member

christarazi commented Aug 9, 2023

I have a reproducer via Maartje for the "delete node" case, but I'm not sure I understand how to reassign IPs for existing nodes in order to exercise that path.

The easiest thing that comes to mind is manually doing a kubectl edit ciliumnode <name> and changing the node's IP from there.

@joestringer
Copy link
Member Author

/test-backport-1.12

@joestringer joestringer force-pushed the pr/joe/fix-ipcache-node-leak-1-12 branch from 8468e03 to 3572d3f Compare August 10, 2023 17:19
Previously the node manager code only ever performed upserts into the
ipcache for remote-node labels on node creation/update events. However,
it never cleaned up these labels upon delete. As a result, if a node was
ever removed, then Cilium would continue to consider traffic towards
that IP as traffic reaching towards a remote node. In extreme cases,
this IP could be reallocated to another host in the network which could
cause connectivity disruption to that peer, particularly if the new peer
is subject to CIDR or Entities based policies.

Signed-off-by: Joe Stringer <joe@cilium.io>
@joestringer joestringer force-pushed the pr/joe/fix-ipcache-node-leak-1-12 branch from 3572d3f to 067523f Compare August 10, 2023 18:38
@joestringer
Copy link
Member Author

/test-backport-1.12

@joestringer
Copy link
Member Author

Hit #27443 , #24840, and an issue where kubectl gets signal: killed during K8sIstioTest. None appear related to the PR.

@joestringer
Copy link
Member Author

/test-1.16-4.9

@joestringer
Copy link
Member Author

/test-1.21-4.9

@joestringer joestringer marked this pull request as ready for review August 11, 2023 02:05
@joestringer joestringer requested a review from a team as a code owner August 11, 2023 02:05
func (ipc *IPCache) RemoveMetadata(prefix string, lbls labels.Labels, src source.Source) {
ipc.metadata.Lock()
ipc.Lock()
_ = ipc.removeLabels(prefix, lbls, src)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heads up that this solves cases where there is a clear disassociation of the node first (eg node is deleted from k8s) before associating the node with another resource like a service, fqdn or kube-apiserver.

In order to solve the other order, we need to handle the resulting labels from this function. I think it's reasonable to review this PR as-is since this at least fixes two bugs (deadlock + deletion of the node followed by association with another resource), then we can follow up on the other case.

@asauber asauber self-requested a review August 11, 2023 17:58
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Aug 11, 2023
@joestringer joestringer merged commit 6d1c07e into v1.12 Aug 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/backports This PR provides functionality previously merged into master. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants