Skip to content

Conversation

jrajahalme
Copy link
Member

@jrajahalme jrajahalme commented Jul 9, 2025

[ upstream commit 92a3319 ]

Require TCP ACK flag to not be set when SYN is set to recreate a CT entry.

This addresses the problem where CT entry is created in the reply direction without a proxy redirect flag when a forward direction CT entry with proxy redirect flag already exists, when a packet from an Envoy upstream connection destination reaches bpf_lxc of the source pod.

The CT proxy redirect flag exists for the purpose of routing reply direction packets to the proxy when they reach the source pod's bpf_lxc program. Recreting the CT entry on the basis of the TCP SYN flag without requiring the ACK flag to be unset defeats this purpose and stalls traffic on source pod/Envoy (downstream) connection.

Example of creation of CT entry in the reply direction (only showing reply direction flows, SYN/ACK in the middle is for a proxy upstream connection that needs to be redirected to the proxy instead of the source pod):

-> endpoint 73 flow 0xba1ee241 , identity 60249->44892 state reply ifindex lxce2292d80c218 orig-ip 10.244.0.215: 10.244.0.215:80 -> 10.244.1.234:39194 tcp SYN, ACK
-> endpoint 73 flow 0xba1ee241 , identity 60249->44892 state reply ifindex lxce2292d80c218 orig-ip 10.244.0.215: 10.244.0.215:80 -> 10.244.1.234:39194 tcp ACK
-> endpoint 73 flow 0xba1ee241 , identity 60249->44892 state reply ifindex lxce2292d80c218 orig-ip 10.244.0.215: 10.244.0.215:80 -> 10.244.1.234:39194 tcp ACK
-> endpoint 73 flow 0x61a4168c , identity 60249->44892 state new ifindex lxce2292d80c218 orig-ip 10.244.0.215: 10.244.0.215:80 -> 10.244.1.234:39194 tcp SYN, ACK
-> endpoint 73 flow 0xba1ee241 , identity 60249->44892 state reply ifindex lxce2292d80c218 orig-ip 10.244.0.215: 10.244.0.215:80 -> 10.244.1.234:39194 tcp ACK
-> endpoint 73 flow 0x1fa7df2a , identity 60249->44892 state established ifindex lxce2292d80c218 orig-ip 10.244.0.215: 10.244.0.215:80 -> 10.244.1.234:39194 tcp SYN, ACK
-> endpoint 73 flow 0xba1ee241 , identity 60249->44892 state reply ifindex lxce2292d80c218 orig-ip 10.244.0.215: 10.244.0.215:80 -> 10.244.1.234:39194 tcp ACK

CT entries, the second one being created in the reply direction:

TCP OUT 10.244.1.234:44892 -> 10.244.0.215:80 expires=1595143 Packets=0 Bytes=0 RxFlagsSeen=0x1b LastRxReport=1587142 TxFlagsSeen=0x00 LastTxReport=1587115 Flags=0x0051 [ RxClosing SeenNonSyn ProxyRedirect ] RevNAT=0 SourceSecurityID=44892 BackendID=0
TCP IN 10.244.0.215:80 -> 10.244.1.234:44892 expires=1595143 Packets=0 Bytes=0 RxFlagsSeen=0x12 LastRxReport=1587116 TxFlagsSeen=0x19 LastTxReport=1587142 Flags=0x0412 [ TxClosing SeenNonSyn FromTunnel ] RevNAT=0 SourceSecurityID=60249 BackendID=0

Requiring the ACK flag be cleared when seeing the SYN flag being set prevents the second CT entry from being created.

Related: #32653

Bpf datapath TCP conntrack entries are (re)created only in the forward direction, solving an issue with freezing proxy connections when backend connection is re-opened.

[ upstream commit 92a3319 ]

Require TCP ACK flag to not be set when SYN is set to recreate a CT entry.

This addresses the problem where CT entry is created in the reply
direction without a proxy redirect flag when a forward direction CT entry
with proxy redirect flag already exists, when a packet from an Envoy
upstream connection destination reaches bpf_lxc of the source pod.

The CT proxy redirect flag exists for the purpose of routing reply
direction packets to the proxy when they reach the source pod's bpf_lxc
program. Recreting the CT entry on the basis of the TCP SYN flag without
requiring the ACK flag to be unset defeats this purpose and stalls
traffic on source pod/Envoy (downstream) connection.

Example of creation of CT entry in the reply direction (only showing
reply direction flows, SYN/ACK in the middle is for a proxy upstream
connection that needs to be redirected to the proxy instead of the source
pod):

-> endpoint 73 flow 0xba1ee241 , identity 60249->44892 state reply ifindex lxce2292d80c218 orig-ip 10.244.0.215: 10.244.0.215:80 -> 10.244.1.234:39194 tcp SYN, ACK
-> endpoint 73 flow 0xba1ee241 , identity 60249->44892 state reply ifindex lxce2292d80c218 orig-ip 10.244.0.215: 10.244.0.215:80 -> 10.244.1.234:39194 tcp ACK
-> endpoint 73 flow 0xba1ee241 , identity 60249->44892 state reply ifindex lxce2292d80c218 orig-ip 10.244.0.215: 10.244.0.215:80 -> 10.244.1.234:39194 tcp ACK
-> endpoint 73 flow 0x61a4168c , identity 60249->44892 state new ifindex lxce2292d80c218 orig-ip 10.244.0.215: 10.244.0.215:80 -> 10.244.1.234:39194 tcp SYN, ACK
-> endpoint 73 flow 0xba1ee241 , identity 60249->44892 state reply ifindex lxce2292d80c218 orig-ip 10.244.0.215: 10.244.0.215:80 -> 10.244.1.234:39194 tcp ACK
-> endpoint 73 flow 0x1fa7df2a , identity 60249->44892 state established ifindex lxce2292d80c218 orig-ip 10.244.0.215: 10.244.0.215:80 -> 10.244.1.234:39194 tcp SYN, ACK
-> endpoint 73 flow 0xba1ee241 , identity 60249->44892 state reply ifindex lxce2292d80c218 orig-ip 10.244.0.215: 10.244.0.215:80 -> 10.244.1.234:39194 tcp ACK

CT entries, the second one being created in the reply direction:

TCP OUT 10.244.1.234:44892 -> 10.244.0.215:80 expires=1595143 Packets=0 Bytes=0 RxFlagsSeen=0x1b LastRxReport=1587142 TxFlagsSeen=0x00 LastTxReport=1587115 Flags=0x0051 [ RxClosing SeenNonSyn ProxyRedirect ] RevNAT=0 SourceSecurityID=44892 BackendID=0
TCP IN 10.244.0.215:80 -> 10.244.1.234:44892 expires=1595143 Packets=0 Bytes=0 RxFlagsSeen=0x12 LastRxReport=1587116 TxFlagsSeen=0x19 LastTxReport=1587142 Flags=0x0412 [ TxClosing SeenNonSyn FromTunnel ] RevNAT=0 SourceSecurityID=60249 BackendID=0

Requiring the ACK flag be cleared when seeing the SYN flag being set
prevents the second CT entry from being created.

Related: cilium#32653
Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
@jrajahalme jrajahalme requested a review from a team as a code owner July 9, 2025 19:00
@jrajahalme jrajahalme added kind/backports This PR provides functionality previously merged into master. backport/1.17 This PR represents a backport for Cilium 1.17.x of a PR that was merged to main. labels Jul 9, 2025
@jrajahalme
Copy link
Member Author

/test

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jul 10, 2025
@sayboras sayboras added this pull request to the merge queue Jul 10, 2025
Merged via the queue into cilium:v1.17 with commit 97e1275 Jul 10, 2025
61 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.17 This PR represents a backport for Cilium 1.17.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master. ready-to-merge This PR has passed all tests and received consensus from code owners to merge.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants