-
Notifications
You must be signed in to change notification settings - Fork 3.4k
ci: skip plain-text TCP RST packets in bpftrace #36962
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
In #35485, we identified a leaked TCP RST packet generated from the kernel client in response to a FIN-ACK coming from the envoy proxy at the server-side. We noticed the behavior is due to a set of timers in the proxy that make it attempt closing (already closed) connections after specific intervals, causing the client to reply with a reset packet. The main issue is that such kernel-level packets don't contain the MARK we rely on to forward traffic from/to proxy, therefore letting the plain text packet through the default path. Let's tolerate such RST packets with this patch in all the cases we'd trace it as plain-text. That means: 1. $src_is_pod && $dst_is_pod 2. $pod_to_pod_via_proxy && ($src_is_pod || $dst_is_pod) Depending on the Cilium config, we observe leaked packets with source address either pod IP or ingress IP. The patch included both the two cases. Fixes: #35485 Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
/test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Since we have a scenario to reproduce: pod-to-pod-with-l7-policy-encryption for ipv6, maybe we can draft a separate test pr, add a commit to run ipv6 pod-to-pod-with-l7-policy-encryption on top of this, to verify the issue is really gone.
#35485 will be closed by merging this PR. I'll open a new issue (#37009) with our current understandings and proposed long term solution.
Plain TCP RST problem in CI is solved, we can merge this. |
I think it would be good to bring this back into all releases to stop the CI bleed. From the looks of it we'll also need #36364 to make this apply cleanly. |
Yep, I've got this on my todos, adding |
Bundled backport PRs are fine, if you're solving a class of issues or bring dependencies along (for instance #33575). |
That's an exact example of what I was looking for, many thanks. |
In #35485, we identified a leaked TCP RST packet generated from the kernel client in response to a FIN-ACK coming from the envoy proxy at the server-side. We noticed the behavior is due to a set of timers in the proxy that make it attempt closing (already closed) connections after specific intervals, causing the client to reply with a reset packet. The main issue is that such kernel-level packets don't contain the MARK we rely on to forward traffic from/to proxy, therefore letting the plain text packet through the default path.
Let's tolerate such RST packets with this patch in all the cases we'd trace it as plain-text. That means:
Depending on the Cilium config, we observe leaked packets with source address either pod IP or ingress IP. The patch included both the two cases.
Fixes: #35485