Skip to content

Conversation

pchaigno
Copy link
Member

@pchaigno pchaigno commented May 28, 2023

This pull request fixes IPsec drops of type XfrmOutPolBlock that would happen with all IPAM modes during upgrades. The root of the issue is that we would delete the old XFRM OUT policies and/or states instead of leaving them around for the duration of the upgrade.

This pull request makes various changes to ensure that we smoothly transition from the old IPsec implementation to the new. After this pull request, we shouldn't have any packet drops on upgrades when using ENI or Azure IPAM modes. Otherwise (for e.g. cluster-pool IPAM mode), it is theoretically possible to have a few drops of type XfrmOutNoStates; those however haven't been noticed in any of the upgrade tests. There is more information on those XfrmOutNoStates drops in the last commit of this pull request.

How was this tested?

The usual upgrade path from v1.13.0 to this patchset was tested on two clusters of 20 nodes each, on EKS and GKE. Before the upgrade, our drop-sensitive migrate-svc application was deployed with 50 clients and 30 backends. That application sends a continuous stream of messages back and forth and fails whenever a message is dropped.

After the upgrade to this patchset, I verified that:

  • All IPsec error counters were at zero.
  • None of the pods from the migrate-svc application had crashed.
  • The connectivity tests passed on the upgraded cluster.
  • No related errors or warnings were reported in the agent logs.

Fixes: #24773.
Fixes: #24030.

Fix a bug that would cause connectivity drops of type XfrmOutPolBlock on upgrade when IPsec is enabled.

@pchaigno pchaigno added kind/bug This is a bug in the Cilium logic. area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/bug This PR fixes an issue in a previous release of Cilium. upgrade-impact This PR has potential upgrade or downgrade impact. area/encryption Impacts encryption support such as IPSec, WireGuard, or kTLS. integration/cloud Related to integration with cloud environments such as AKS, EKS, GKE, etc. needs-backport/1.11 labels May 28, 2023
@pchaigno pchaigno force-pushed the ipsec-ippools-upgrade branch 5 times, most recently from ff21f54 to 1db1553 Compare May 30, 2023 17:00
@pchaigno pchaigno changed the title ipsec: Fix XfrmOutPolBlock drops on ESK & AKS upgrades ipsec: Fix XfrmOutPolBlock drops upgrades May 30, 2023
@pchaigno pchaigno requested a review from jschwinger233 May 30, 2023 19:19
@pchaigno pchaigno marked this pull request as ready for review May 30, 2023 19:19
@pchaigno pchaigno requested review from a team as code owners May 30, 2023 19:19
@pchaigno pchaigno requested a review from tommyp1ckles May 30, 2023 19:19
Copy link
Member

@jschwinger233 jschwinger233 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way to make xfrm states coexist is marvelous!
Just a newbee question: we coexist xfrm states, but update xfrm policies directly, and upgrade test still passes, so where do we use xfrm policy?

@pchaigno pchaigno force-pushed the ipsec-ippools-upgrade branch from 1db1553 to df366bb Compare May 31, 2023 12:55
@pchaigno
Copy link
Member Author

Just a newbee question: we coexist xfrm states, but update xfrm policies directly, and upgrade test still passes, so where do we use xfrm policy?

Actually, we don't update the XFRM policies directly. In the second commit, we deprioritize the old XFRM policies such that they can coexist with the new ones added as part of the upgrade. I've updated the commit descriptions to include example of how the XFRM configs will change.

If a packet comes into the XFRM subsystem, it will first be matched against the new XFRM policies with marks of the form 0xXXXXYe00/0xffffff00. If that doesn't match because the packet is still using the old mark format (because bpf_lxc hasn't been reloaded yet), then it will match against the deprioritized XFRM policies with marks of the form 0xYe00/0xff00.

Once we've matched against a policy, it will search for the matching XFRM state using the IP addresses from the policy's template. For example, with the below policies, it will look for an XFRM state with src 10.24.1.77 dst 10.24.2.207 that also matches the packet mark.

     src 10.24.1.0/24 dst 10.24.2.0/24
          dir out priority 50
          mark 0x3e00/0xff00
          tmpl src 10.24.1.77 dst 10.24.2.207
              proto esp spi 0x00000003 reqid 1 mode tunnel

@pchaigno pchaigno changed the title ipsec: Fix XfrmOutPolBlock drops upgrades ipsec: Fix XfrmOutPolBlock drops on upgrades May 31, 2023
@tommyp1ckles
Copy link
Contributor

Question: are the prioritized policies just left like this forever - could we not clean them up after some time?

I guess we'd have to know that all nodes have had their host endpoints reloaded before doing that...

(I'm wondering if leaving old policies around might provide other places for issues during future version upgrades

@tommyp1ckles
Copy link
Contributor

Agent endpoint changes seem fine to me, just a few questions, I'll take another look tomorrow morning.

This commit lowers the priority of the catch-all default-drop XFRM OUT
policies, from 1 to 100. For context, 0 is the highest possible
priority.

This change will allow us to introduce several levels of priorities for
XFRM OUT policies in subsequent commits.

Diff before/after this patch:

      src 0.0.0.0/0 dst 0.0.0.0/0
    -     dir out action block priority 1
    +     dir out action block priority 100
          mark 0xe00/0xf00

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jun 2, 2023
@pchaigno pchaigno merged commit c0d9b8c into cilium:main Jun 2, 2023
@pchaigno pchaigno deleted the ipsec-ippools-upgrade branch June 2, 2023 10:19
@pchaigno pchaigno added backport-done/1.12 The backport for Cilium 1.12.x for this PR is done. backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. and removed backport-pending/1.12 labels Jun 9, 2023
@qmonnet qmonnet added backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. and removed backport-pending/1.11 labels Jun 14, 2023
ldelossa added a commit to ldelossa/cilium that referenced this pull request Jul 8, 2024
This code was introduced in cilium#25735.

As described well in:

16e446c
bpf: Support the old IP_POOLS logic in bpf_host

and

b429c42
daemon: Reload bpf_host first in case of IPsec upgrade

This code was put in place to ensure the refactored 'bpf_host' program,
which supports the IP_POOLS hack upgrade compatibility, is installed before any
other eBPF prog.

Now that we are N+2 versions from this introduction, and we instruct
individuals to upgrade consecutively, this code can be removed.

Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
ldelossa added a commit to ldelossa/cilium that referenced this pull request Jul 8, 2024
This code was introduced in cilium#25735.

As described well in:

16e446c
bpf: Support the old IP_POOLS logic in bpf_host

and

b429c42
daemon: Reload bpf_host first in case of IPsec upgrade

This code was put in place to ensure the refactored 'bpf_host' program,
which supports the IP_POOLS hack upgrade compatibility, is installed before any
other eBPF prog.

Now that we are N+2 versions from this introduction, and we instruct
individuals to upgrade consecutively, this code can be removed.

Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
github-merge-queue bot pushed a commit that referenced this pull request Jul 12, 2024
This code was introduced in #25735.

As described well in:

16e446c
bpf: Support the old IP_POOLS logic in bpf_host

and

b429c42
daemon: Reload bpf_host first in case of IPsec upgrade

This code was put in place to ensure the refactored 'bpf_host' program,
which supports the IP_POOLS hack upgrade compatibility, is installed before any
other eBPF prog.

Now that we are N+2 versions from this introduction, and we instruct
individuals to upgrade consecutively, this code can be removed.

Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. area/encryption Impacts encryption support such as IPSec, WireGuard, or kTLS. backport/author The backport will be carried out by the author of the PR. backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. backport-done/1.12 The backport for Cilium 1.12.x for this PR is done. backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. integration/cloud Related to integration with cloud environments such as AKS, EKS, GKE, etc. kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium. upgrade-impact This PR has potential upgrade or downgrade impact.
Projects
No open projects
Status: Released
Development

Successfully merging this pull request may close these issues.

5 participants