Skip to content

Conversation

smagnani96
Copy link
Contributor

@smagnani96 smagnani96 commented Jan 13, 2025

In #35485, we identified a leaked TCP RST packet generated from the kernel client in response to a FIN-ACK coming from the envoy proxy at the server-side. We noticed the behavior is due to a set of timers in the proxy that make it attempt closing (already closed) connections after specific intervals, causing the client to reply with a reset packet. The main issue is that such kernel-level packets don't contain the MARK we rely on to forward traffic from/to proxy, therefore letting the plain text packet through the default path.

Let's tolerate such RST packets with this patch in all the cases we'd trace it as plain-text. That means:

  1. $src_is_pod && $dst_is_pod
  2. $pod_to_pod_via_proxy && ($src_is_pod || $dst_is_pod)

Depending on the Cilium config, we observe leaked packets with source address either pod IP or ingress IP. The patch included both the two cases.

Fixes: #35485

Skip tracking unmarked plain-text TCP RST packets generated from proxy timeouts in the CI bpftrace script.

In #35485, we identified a leaked TCP RST packet generated from the
kernel client in response to a FIN-ACK coming from the envoy proxy at the
server-side. We noticed the behavior is due to a set of timers in the
proxy that make it attempt closing (already closed) connections after
specific intervals, causing the client to reply with a reset packet.
The main issue is that such kernel-level packets don't contain the MARK
we rely on to forward traffic from/to proxy, therefore letting the plain
text packet through the default path.

Let's tolerate such RST packets with this patch in all the cases we'd
trace it as plain-text. That means:

1. $src_is_pod && $dst_is_pod
2. $pod_to_pod_via_proxy && ($src_is_pod || $dst_is_pod)

Depending on the Cilium config, we observe leaked packets with source
address either pod IP or ingress IP. The patch included both the two cases.

Fixes: #35485

Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jan 13, 2025
@smagnani96
Copy link
Contributor Author

/test

Copy link
Member

@jschwinger233 jschwinger233 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Since we have a scenario to reproduce: pod-to-pod-with-l7-policy-encryption for ipv6, maybe we can draft a separate test pr, add a commit to run ipv6 pod-to-pod-with-l7-policy-encryption on top of this, to verify the issue is really gone.

#35485 will be closed by merging this PR. I'll open a new issue (#37009) with our current understandings and proposed long term solution.

@smagnani96
Copy link
Contributor Author

Plain TCP RST problem in CI is solved, we can merge this.
However, I opened a new issue (#37051) for a new problem while trying to enable, upon this fix, the pod-to-pod-with-l7-policy-encryption test. The new problem is unrelated to this plain TCP RST issue.

@smagnani96 smagnani96 added area/encryption Impacts encryption support such as IPSec, WireGuard, or kTLS. release-note/ci This PR makes changes to the CI. and removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Jan 17, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot added ready-to-merge This PR has passed all tests and received consensus from code owners to merge. labels Jan 17, 2025
@smagnani96 smagnani96 added kind/enhancement This would improve or streamline existing functionality. feature/ipsec Relates to Cilium's IPsec feature and removed ready-to-merge This PR has passed all tests and received consensus from code owners to merge. labels Jan 17, 2025
@smagnani96 smagnani96 marked this pull request as ready for review January 17, 2025 17:21
@smagnani96 smagnani96 requested review from a team as code owners January 17, 2025 17:21
@aanm aanm added this pull request to the merge queue Jan 17, 2025
Merged via the queue into main with commit 09551d2 Jan 17, 2025
297 of 298 checks passed
@aanm aanm deleted the pr/smagnani96/check-ipsec-leak-tolerate-rst branch January 17, 2025 19:00
@julianwiedmann
Copy link
Member

I think it would be good to bring this back into all releases to stop the CI bleed. From the looks of it we'll also need #36364 to make this apply cleanly.

@julianwiedmann julianwiedmann added needs-backport/1.14 needs-backport/1.16 This PR / issue needs backporting to the v1.16 branch needs-backport/1.17 This PR / issue needs backporting to the v1.17 branch labels Jan 20, 2025
@smagnani96 smagnani96 added the backport/author The backport will be carried out by the author of the PR. label Jan 20, 2025
@smagnani96
Copy link
Contributor Author

I think it would be good to bring this back into all releases to stop the CI bleed. From the looks of it we'll also need #36364 to make this apply cleanly.

Yep, I've got this on my todos, adding backport/author.
Should the backports be in separate PRs though, right?

@julianwiedmann
Copy link
Member

I think it would be good to bring this back into all releases to stop the CI bleed. From the looks of it we'll also need #36364 to make this apply cleanly.

Yep, I've got this on my todos, adding backport/author. Should the backports be in separate PRs though, right?

Bundled backport PRs are fine, if you're solving a class of issues or bring dependencies along (for instance #33575).

@smagnani96
Copy link
Contributor Author

Bundled backport PRs are fine, if you're solving a class of issues or bring dependencies along (for instance #33575).

That's an exact example of what I was looking for, many thanks.

@smagnani96 smagnani96 removed the backport/author The backport will be carried out by the author of the PR. label Jan 21, 2025
@rastislavs rastislavs mentioned this pull request Jan 22, 2025
19 tasks
@rastislavs rastislavs added backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. and removed needs-backport/1.16 This PR / issue needs backporting to the v1.16 branch labels Jan 22, 2025
@rastislavs rastislavs mentioned this pull request Jan 22, 2025
6 tasks
@rastislavs rastislavs added backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. and removed needs-backport/1.15 labels Jan 22, 2025
@rastislavs rastislavs mentioned this pull request Jan 22, 2025
4 tasks
@rastislavs rastislavs added backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. and removed needs-backport/1.14 labels Jan 22, 2025
@github-actions github-actions bot added backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. and removed backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. backport-pending/1.16 The backport for Cilium 1.16.x for this PR is in progress. labels Jan 22, 2025
@rastislavs rastislavs mentioned this pull request Jan 24, 2025
14 tasks
@rastislavs rastislavs added backport-pending/1.17 The backport for Cilium 1.17.x for this PR is in progress. and removed needs-backport/1.17 This PR / issue needs backporting to the v1.17 branch labels Jan 24, 2025
@github-actions github-actions bot added backport-done/1.17 The backport for Cilium 1.17.x for this PR is done. and removed backport-pending/1.17 The backport for Cilium 1.17.x for this PR is in progress. labels Jan 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/encryption Impacts encryption support such as IPSec, WireGuard, or kTLS. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. backport-done/1.16 The backport for Cilium 1.16.x for this PR is done. backport-done/1.17 The backport for Cilium 1.17.x for this PR is done. feature/ipsec Relates to Cilium's IPsec feature kind/enhancement This would improve or streamline existing functionality. release-note/ci This PR makes changes to the CI.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

plain TCP RST packet in pod-to-pod-with-l7-policy-encryption for IPv6+IPSec+VXLan
5 participants