ipsec: Fix leak of XFRM OUT policies #25784

Merged on Jun 1, 2023 (2 commits)

pchaigno
Member

This pull request updates our cleanup logic for XFRM configs when remote nodes are deleted, to fix a leak in Azure and ENI IPAM modes.

Fixes: #24030.

Fix a bug that leaked Linux XFRM policies, potentially leading to increased CPU consumption, when IPsec is enabled with Azure or ENI IPAM.

pchaigno added 2 commits May 31, 2023 11:06
This commit simply refactors some existing code into a new
getNodeIDForNode function. This function will be called from elsewhere
in a subsequent commit.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Our logic to clean up old XFRM configs on node deletion currently relies
on the destination IP to identify the configs to remove. That doesn't
work with ENI and Azure IPAMs, but until recently, it didn't need to. On
ENI and Azure IPAMs we didn't have per-node XFRM configs.

That changed in commit 3e59b68 ("ipsec: Per-node XFRM states &
policies for EKS & AKS"). We now need to clean up per-node XFRM configs
for ENI and Azure IPAM modes as well, and we can't rely on the
destination IP for that because the XFRM policies don't match on that
destination IP.

Instead, since commit 73c36d4 ("ipsec: Match OUT XFRM states &
policies using node IDs"), we match the per-node XFRM configs using node
IDs encoded in the packet mark. The good news is that this is true for
all IPAM modes (whether Azure, ENI, cluster-pool, or something else).

So our cleanup logic can now rely on the node ID of the deleted node to
clean up its XFRM states and policies. This commit implements that.

Fixes: 3e59b68 ("ipsec: Per-node XFRM states & policies for EKS & AKS")
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
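
To illustrate the idea in the commit message above, here is a minimal, self-contained sketch of selecting OUT policies by the node ID carried in the packet mark instead of by destination IP. It is not the code from this PR. The `xfrmPolicy` type, the `splitByNodeID` helper, and the bit layout (node ID in the upper 16 bits of the mark, `0x0e00` as the encryption marker) are assumptions made for the example.

```go
package main

import "fmt"

// Assumed mark layout, for illustration only: the node ID occupies the
// upper 16 bits of the 32-bit packet mark.
const (
	nodeIDShift = 16
	nodeIDMask  = uint32(0xffff0000)
	encryptBits = uint32(0x0e00) // assumed "encrypt" marker in the lower bits
)

// xfrmPolicy is a hypothetical, simplified stand-in for a kernel XFRM policy.
type xfrmPolicy struct {
	Dir      string // "in", "out" or "fwd"
	Dst      string // destination selector; may be a wildcard
	Mark     uint32
	MarkMask uint32
}

// markForNodeID builds the mark value an OUT policy for nodeID would match.
func markForNodeID(nodeID uint16) uint32 {
	return uint32(nodeID)<<nodeIDShift | encryptBits
}

// nodeIDFromMark extracts the node ID from a mark value.
func nodeIDFromMark(mark uint32) uint16 {
	return uint16((mark & nodeIDMask) >> nodeIDShift)
}

// splitByNodeID separates the OUT policies that belong to nodeID from all
// other policies. Only policies whose mask covers the whole node ID field
// are considered, so a partial match cannot select the wrong node.
func splitByNodeID(policies []xfrmPolicy, nodeID uint16) (kept, removed []xfrmPolicy) {
	for _, p := range policies {
		matches := p.Dir == "out" &&
			p.MarkMask&nodeIDMask == nodeIDMask &&
			nodeIDFromMark(p.Mark) == nodeID
		if matches {
			removed = append(removed, p)
		} else {
			kept = append(kept, p)
		}
	}
	return kept, removed
}

func main() {
	policies := []xfrmPolicy{
		// Both OUT policies share a wildcard destination, so the
		// destination IP cannot tell them apart; the mark can.
		{Dir: "out", Dst: "0.0.0.0/0", Mark: markForNodeID(7), MarkMask: 0xffffff00},
		{Dir: "out", Dst: "0.0.0.0/0", Mark: markForNodeID(9), MarkMask: 0xffffff00},
		{Dir: "in", Dst: "10.0.0.0/16", Mark: 0x0d00, MarkMask: 0x0f00},
	}
	kept, removed := splitByNodeID(policies, 7)
	fmt.Println("kept:", len(kept), "removed:", len(removed))
}
```

In this sketch the two OUT policies have the same wildcard destination, which mirrors the situation described for ENI and Azure IPAM: only the node ID in the mark identifies which policy belongs to the deleted node.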
@pchaigno pchaigno added kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium. area/encryption Impacts encryption support such as IPSec, WireGuard, or kTLS. area/eni Impacts ENI based IPAM. integration/cloud Related to integration with cloud environments such as AKS, EKS, GKE, etc. needs-backport/1.11 labels May 31, 2023
@pchaigno
Member Author

/test

@pchaigno pchaigno requested a review from jschwinger233 May 31, 2023 11:05
@pchaigno pchaigno marked this pull request as ready for review May 31, 2023 11:05
@pchaigno pchaigno requested review from a team as code owners May 31, 2023 11:05
Member

@borkmann borkmann left a comment

lgtm, thanks Paul!

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jun 1, 2023
@pchaigno pchaigno merged commit 9cc8a89 into cilium:main Jun 1, 2023
@pchaigno pchaigno deleted the fix-xfrm-leak branch June 1, 2023 06:52
@pchaigno pchaigno added release-blocker/1.11 backport/author The backport will be carried out by the author of the PR. labels Jun 1, 2023
@pchaigno pchaigno added backport-done/1.12 The backport for Cilium 1.12.x for this PR is done. backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. and removed backport-pending/1.12 labels Jun 9, 2023
@qmonnet qmonnet added backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. and removed backport-pending/1.11 labels Jun 14, 2023
Labels
area/encryption Impacts encryption support such as IPSec, WireGuard, or kTLS. area/eni Impacts ENI based IPAM. backport/author The backport will be carried out by the author of the PR. backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. backport-done/1.12 The backport for Cilium 1.12.x for this PR is done. backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. integration/cloud Related to integration with cloud environments such as AKS, EKS, GKE, etc. kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
No open projects
Status: Released
4 participants