bpf: Recreate CT entry in forward direction only #40427
Merged
/test
borkmann
approved these changes
Jul 9, 2025
lgtm! Do we also have a Fixes tag with the commit that changed this behavior compared to 1.15? Would be nice to add the Fixes tag to the commit desc.
cef35d8 to d2fd754
/test
Require the TCP ACK flag to be unset when SYN is set in order to recreate a CT entry.

This addresses the problem where a CT entry is created in the reply direction without a proxy redirect flag, even though a forward direction CT entry with the proxy redirect flag already exists, when a packet from an Envoy upstream connection destination reaches bpf_lxc of the source pod.

The CT proxy redirect flag exists to route reply direction packets to the proxy when they reach the source pod's bpf_lxc program. Recreating the CT entry on the basis of the TCP SYN flag without requiring the ACK flag to be unset defeats this purpose and stalls traffic on the source pod/Envoy (downstream) connection.

Example of creation of a CT entry in the reply direction (only showing reply direction flows; the SYN/ACK in the middle is for a proxy upstream connection that needs to be redirected to the proxy instead of the source pod):

    -> endpoint 73 flow 0xba1ee241 , identity 60249->44892 state reply ifindex lxce2292d80c218 orig-ip 10.244.0.215: 10.244.0.215:80 -> 10.244.1.234:39194 tcp SYN, ACK
    -> endpoint 73 flow 0xba1ee241 , identity 60249->44892 state reply ifindex lxce2292d80c218 orig-ip 10.244.0.215: 10.244.0.215:80 -> 10.244.1.234:39194 tcp ACK
    -> endpoint 73 flow 0xba1ee241 , identity 60249->44892 state reply ifindex lxce2292d80c218 orig-ip 10.244.0.215: 10.244.0.215:80 -> 10.244.1.234:39194 tcp ACK
    -> endpoint 73 flow 0x61a4168c , identity 60249->44892 state new ifindex lxce2292d80c218 orig-ip 10.244.0.215: 10.244.0.215:80 -> 10.244.1.234:39194 tcp SYN, ACK
    -> endpoint 73 flow 0xba1ee241 , identity 60249->44892 state reply ifindex lxce2292d80c218 orig-ip 10.244.0.215: 10.244.0.215:80 -> 10.244.1.234:39194 tcp ACK
    -> endpoint 73 flow 0x1fa7df2a , identity 60249->44892 state established ifindex lxce2292d80c218 orig-ip 10.244.0.215: 10.244.0.215:80 -> 10.244.1.234:39194 tcp SYN, ACK
    -> endpoint 73 flow 0xba1ee241 , identity 60249->44892 state reply ifindex lxce2292d80c218 orig-ip 10.244.0.215: 10.244.0.215:80 -> 10.244.1.234:39194 tcp ACK

CT entries, the second one being created in the reply direction:

    TCP OUT 10.244.1.234:44892 -> 10.244.0.215:80 expires=1595143 Packets=0 Bytes=0 RxFlagsSeen=0x1b LastRxReport=1587142 TxFlagsSeen=0x00 LastTxReport=1587115 Flags=0x0051 [ RxClosing SeenNonSyn ProxyRedirect ] RevNAT=0 SourceSecurityID=44892 BackendID=0
    TCP IN 10.244.0.215:80 -> 10.244.1.234:44892 expires=1595143 Packets=0 Bytes=0 RxFlagsSeen=0x12 LastRxReport=1587116 TxFlagsSeen=0x19 LastTxReport=1587142 Flags=0x0412 [ TxClosing SeenNonSyn FromTunnel ] RevNAT=0 SourceSecurityID=60249 BackendID=0

Requiring the ACK flag to be clear whenever the SYN flag is set prevents the second CT entry from being created.

Related: cilium#32653

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
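For readers decoding the hex values in the CT dump above, here is a minimal sketch that turns a flags byte into TCP flag names. It assumes that RxFlagsSeen and TxFlagsSeen hold the standard TCP header flag bits; that assumption is consistent with the dump but is not stated in the commit message. The helper name `tcp_flags_str` is hypothetical and is not part of Cilium.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Standard TCP header flag bits (low byte of the flags field). */
#define TCP_FLAG_FIN 0x01
#define TCP_FLAG_SYN 0x02
#define TCP_FLAG_RST 0x04
#define TCP_FLAG_PSH 0x08
#define TCP_FLAG_ACK 0x10

/* Hypothetical helper: writes the names of the flags set in `flags`
 * into `buf`, separated by single spaces, and returns `buf`. */
static const char *tcp_flags_str(uint8_t flags, char *buf, size_t len)
{
    static const struct { uint8_t bit; const char *name; } names[] = {
        { TCP_FLAG_FIN, "FIN" }, { TCP_FLAG_SYN, "SYN" },
        { TCP_FLAG_RST, "RST" }, { TCP_FLAG_PSH, "PSH" },
        { TCP_FLAG_ACK, "ACK" },
    };
    size_t used = 0;

    if (len == 0)
        return buf;
    buf[0] = '\0';

    for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        if (!(flags & names[i].bit))
            continue;
        /* Prefix a space before every name except the first one. */
        int n = snprintf(buf + used, len - used, "%s%s",
                         used ? " " : "", names[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break; /* output truncated; stop appending */
        used += (size_t)n;
    }
    return buf;
}
```

Under that assumption, RxFlagsSeen=0x12 on the TCP IN entry decodes to SYN and ACK, which matches an entry created by a SYN, ACK segment. On the TCP OUT entry, 0x1b decodes to FIN, SYN, PSH and ACK.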
d2fd754 to 2678877
rebased in hopes to get rid of a verifier test failure:
/test
This was referenced Jul 9, 2025
Labels
area/datapath
Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
area/proxy
Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers.
backport/author
The backport will be carried out by the author of the PR.
backport-done/1.16
The backport for Cilium 1.16.x for this PR is done.
backport-done/1.17
The backport for Cilium 1.17.x for this PR is done.
backport-done/1.18
The backport for Cilium 1.18.x for this PR is done.
kind/bug
This is a bug in the Cilium logic.
ready-to-merge
This PR has passed all tests and received consensus from code owners to merge.
release-blocker/1.16
This issue will prevent the release of the next version of Cilium.
release-blocker/1.17
This issue will prevent the release of the next version of Cilium.
release-blocker/1.18
This issue will prevent the release of the next version of Cilium.
release-note/bug
This PR fixes an issue in a previous release of Cilium.
Require the TCP ACK flag to be unset when SYN is set in order to recreate a CT entry.
This addresses the problem where a CT entry is created in the reply direction without a proxy redirect flag, even though a forward direction CT entry with the proxy redirect flag already exists, when a packet from an Envoy upstream connection destination reaches bpf_lxc of the source pod.
The CT proxy redirect flag exists to route reply direction packets to the proxy when they reach the source pod's bpf_lxc program. Recreating the CT entry on the basis of the TCP SYN flag without requiring the ACK flag to be unset defeats this purpose and stalls traffic on the source pod/Envoy (downstream) connection.
Example of creation of CT entry in the reply direction (only showing reply direction flows, SYN/ACK in the middle is for a proxy upstream connection that needs to be redirected to the proxy instead of the source pod):
CT entries, the second one being created in the reply direction:
Requiring the ACK flag to be clear whenever the SYN flag is set prevents the second CT entry from being created.
Related: #32653
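The condition described above can be illustrated with a small standalone sketch. It is not the actual bpf/ code from this PR: the function names `ct_recreate_on_syn_old` and `ct_recreate_on_syn` are hypothetical, and only the TCP flag bit values are standard. It contrasts a check on SYN alone with a check that also requires ACK to be clear.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard TCP header flag bits. */
#define TCP_FLAG_SYN 0x02
#define TCP_FLAG_ACK 0x10

/* Hypothetical model of the earlier behavior: any segment carrying SYN,
 * including a SYN, ACK travelling in the reply direction, qualifies for
 * CT entry recreation. */
static bool ct_recreate_on_syn_old(uint8_t tcp_flags)
{
    return (tcp_flags & TCP_FLAG_SYN) != 0;
}

/* Hypothetical model of the behavior described in this PR: only a SYN
 * with ACK clear (the forward direction of a handshake) qualifies. */
static bool ct_recreate_on_syn(uint8_t tcp_flags)
{
    return (tcp_flags & (TCP_FLAG_SYN | TCP_FLAG_ACK)) == TCP_FLAG_SYN;
}
```

In this model, the stricter check rejects a SYN, ACK segment (0x12) like the ones in the reply direction flows, and a plain SYN (0x02) still qualifies.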