-
Notifications
You must be signed in to change notification settings - Fork 3.4k
ipsec: Fix cleanup of XFRM states and policies #26072
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
After applying a backport of 9cc8a89 ("ipsec: Fix leak of XFRM policies with ENI and Azure IPAMs") to 1.11.16, I noticed that we were getting occasional spikes of "no inbound state" xfrm errors (XfrmInNoStates). These lead to packet loss and brief outages for applications sending traffic to the node on which the spikes occur. I noticed that the "No node ID found for node." logline would appear at the time of these spikes and from the code this is logged when the node ID cannot be resolved. Looking a bit further the call to `DeleteIPsecEndpoint` will end up deleting the xfrm state for any state that matches the node id as derived from the mark in the state. The problem seems to be that the inbound state for 0.0.0.0/0 -> node IP has a mark of `0xd00` which when shifted >> 16 in `getNodeIDFromXfrmMark` matches nodeID 0 and so the inbound state gets deleted and the kernel drops all the inbound traffic as it no longer matches a state. This commit updates that logic to skip the XFRM state and policy deletion when the node ID is zero. Fixes: 9cc8a89 ("ipsec: Fix leak of XFRM policies with ENI and Azure IPAMs") Signed-off-by: Steven Johnson <sjdot@protonmail.com> Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
With commit 9cc8a89 ("ipsec: Fix leak of XFRM policies with ENI and Azure IPAMs") we rely on the node ID to find XFRM states and policies that belong to remote nodes, to clean them up when remote nodes are deleted. This commit makes sure that we only do this for XFRM states and policies that actually match on these node IDs. That should only be the same if the mark mask matches on node ID bits. Thus if should look like 0xffffff00 (matches on node ID, SPI, and encryption bit) or 0xffff0f00 (matches on node and encryption bit). Fixes: 9cc8a89 ("ipsec: Fix leak of XFRM policies with ENI and Azure IPAMs") Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
/test |
qmonnet
approved these changes
Jun 9, 2023
jschwinger233
approved these changes
Jun 9, 2023
Conformance Ginkgo hit #25964 |
I performed an IPsec upgrade on EKS from v1.13.0 to this patch, just in case. I also checked for XFRM leaks by doing a scale up and down from 3 to 9 nodes and back. Connectivity tests also passed (although the CI should already cover that anyway). |
This was referenced Jun 12, 2023
This was referenced Jun 14, 2023
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
area/encryption
Impacts encryption support such as IPSec, WireGuard, or kTLS.
backport-done/1.11
The backport for Cilium 1.11.x for this PR is done.
backport-done/1.12
The backport for Cilium 1.12.x for this PR is done.
backport-done/1.13
The backport for Cilium 1.13.x for this PR is done.
ready-to-merge
This PR has passed all tests and received consensus from code owners to merge.
release-note/misc
This PR makes changes that have no direct user impact.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
See commits for details.
Fixes: #25784.