Description
Is there an existing issue for this?
- I have searched the existing issues
Version
equal or higher than v1.16.0 and lower than v1.17.0
What happened?
When our pods initiate an outgoing connection to an external host, the reply (the SYN-ACK) is registered as a new incoming connection in the ct table (and would be dropped if we were not in audit mode).
Here is the cilium-dbg monitor output for the traffic (monitoring traffic that performs an HTTP GET):
# cilium-dbg monitor |grep 172.27.210.23
Listening for events on 56 CPUs with 64x4096 of shared memory
Press Ctrl-C to quit
time="2024-09-26T14:45:06Z" level=info msg="Initializing dissection cache..." subsys=monitor
Policy verdict log: flow 0x8e7c81d1 local EP ID 3807, remote ID 16777226, proto 6, egress, action allow, auth: disabled, match L3-Only, 100.80.3.228:36946 -> 172.27.210.23:443 tcp SYN
-> stack flow 0x8e7c81d1 , identity 49712->16777226 state new ifindex 0 orig-ip 0.0.0.0: 100.80.3.228:36946 -> 172.27.210.23:443 tcp SYN
Policy verdict log: flow 0x0 local EP ID 594, remote ID 16777226, proto 6, ingress, action audit, auth: disabled, match none, 172.27.210.23:443 -> 172.27.50.155:36946 tcp SYN, ACK
-> endpoint 3807 flow 0x0 , identity 16777226->49712 state reply ifindex lxc7fb4017156e7 orig-ip 172.27.210.23: 172.27.210.23:443 -> 100.80.3.228:36946 tcp SYN, ACK
-> stack flow 0x8e7c81d1 , identity 49712->16777226 state established ifindex 0 orig-ip 0.0.0.0: 100.80.3.228:36946 -> 172.27.210.23:443 tcp ACK
-> stack flow 0x8e7c81d1 , identity 49712->16777226 state established ifindex 0 orig-ip 0.0.0.0: 100.80.3.228:36946 -> 172.27.210.23:443 tcp ACK
-> endpoint 3807 flow 0x0 , identity 16777226->49712 state reply ifindex lxc7fb4017156e7 orig-ip 172.27.210.23: 172.27.210.23:443 -> 100.80.3.228:36946 tcp ACK
-> endpoint 3807 flow 0x0 , identity 16777226->49712 state reply ifindex lxc7fb4017156e7 orig-ip 172.27.210.23: 172.27.210.23:443 -> 100.80.3.228:36946 tcp ACK, FIN
-> stack flow 0x8e7c81d1 , identity 49712->16777226 state established ifindex 0 orig-ip 0.0.0.0: 100.80.3.228:36946 -> 172.27.210.23:443 tcp ACK, FIN
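The symptom is visible in the trace itself: the SYN-ACK triggers a "Policy verdict log ... ingress, action audit, match none" even though the immediately following trace line classifies the same packet as state reply. As a rough sketch (a hypothetical helper, not part of any Cilium tooling), this pattern can be detected mechanically in the monitor text:

```python
# Sketch: flag policy verdicts that were issued for packets the datapath
# itself classifies as replies, from `cilium-dbg monitor` text output.
# The sample below is copied from the monitor output above.
MONITOR = """\
Policy verdict log: flow 0x0 local EP ID 594, remote ID 16777226, proto 6, ingress, action audit, auth: disabled, match none, 172.27.210.23:443 -> 172.27.50.155:36946 tcp SYN, ACK
-> endpoint 3807 flow 0x0 , identity 16777226->49712 state reply ifindex lxc7fb4017156e7 orig-ip 172.27.210.23: 172.27.210.23:443 -> 100.80.3.228:36946 tcp SYN, ACK
"""

def verdicts_on_replies(text: str) -> list[str]:
    """Return verdict lines immediately followed by a trace marking the packet a reply."""
    lines = text.splitlines()
    return [cur for cur, nxt in zip(lines, lines[1:])
            if cur.startswith("Policy verdict log:") and "state reply" in nxt]

print(len(verdicts_on_replies(MONITOR)))  # → 1
```

A healthy connection should produce zero such pairs: reply-direction packets are expected to match the existing ct entry and bypass policy evaluation entirely.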
The same traffic registers two entries in the ct map (the second, a TCP IN, is clearly wrong):
# cilium-dbg bpf ct list global |grep 100.80.3.228:36946
TCP OUT 100.80.3.228:36946 -> 172.27.210.23:443 expires=3026763 Packets=0 Bytes=0 RxFlagsSeen=0x1b LastRxReport=3026753 TxFlagsSeen=0x1b LastTxReport=3026753 Flags=0x0013 [ RxClosing TxClosing SeenNonSyn ] RevNAT=0 SourceSecurityID=49712 IfIndex=0 BackendID=0
# cilium-dbg bpf ct list global |grep 172.27.50.155:36946
TCP IN 172.27.210.23:443 -> 172.27.50.155:36946 expires=3034753 Packets=0 Bytes=0 RxFlagsSeen=0x1b LastRxReport=3026753 TxFlagsSeen=0x00 LastTxReport=0 Flags=0x0011 [ RxClosing SeenNonSyn ] RevNAT=0 SourceSecurityID=16777226 IfIndex=0 BackendID=0
While the "TCP OUT" entry disappears (presumably cleaned up because the connection terminated properly), many "TCP IN" entries keep hanging around:
# cilium-dbg bpf ct list global|grep "^TCP IN 172.27.210.23:443"|wc -l
2368
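Note that both ct entries show RxFlagsSeen=0x1b. Assuming these fields use the standard TCP header flag bit layout (FIN=0x01, SYN=0x02, RST=0x04, PSH=0x08, ACK=0x10, URG=0x20 — an assumption, not verified against the Cilium source), 0x1b decodes to FIN|SYN|PSH|ACK, i.e. the bogus "TCP IN" entry saw the handshake, data, and teardown of the same connection. A minimal decoder sketch:

```python
# Sketch: decode the RxFlagsSeen/TxFlagsSeen bitmask from `cilium-dbg bpf ct list`,
# assuming the standard TCP flag bit layout (not confirmed against the Cilium source).
TCP_FLAGS = {0x01: "FIN", 0x02: "SYN", 0x04: "RST", 0x08: "PSH", 0x10: "ACK", 0x20: "URG"}

def decode_flags(mask: int) -> list[str]:
    """Return the flag names set in a RxFlagsSeen/TxFlagsSeen bitmask."""
    return [name for bit, name in sorted(TCP_FLAGS.items()) if mask & bit]

print(decode_flags(0x1b))  # → ['FIN', 'SYN', 'PSH', 'ACK']
```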
We run Cilium in default-deny mode with policy audit enabled:
$ cilium config view |egrep -e enable-policy -e policy-audit-mode
enable-policy always
policy-audit-mode true
Every such flow produces an entry in the audit log, where the SYN-ACK packet is reported as not being a reply ("is_reply": false):
{
"flow": {
"time": "2024-09-26T12:17:05.377566426Z",
"verdict": "AUDIT",
"ethernet": {
"source": "MAC1",
"destination": "MAC2"
},
"IP": {
"source": "172.27.210.23",
"destination": "172.27.50.155",
"ipVersion": "IPv4"
},
"l4": {
"TCP": {
"source_port": 443,
"destination_port": 36582,
"flags": {
"SYN": true,
"ACK": true
}
}
},
"source": {
"identity": 16777226
},
"destination": {
"identity": 1
},
"Type": "L3_L4",
"node_name": "NODENAME",
"event_type": {
"type": 5
},
"is_reply": false,
"Summary": "TCP Flags: SYN, ACK"
},
"node_name": "NODENAME",
"time": "2024-09-26T12:17:05.377566426Z"
}
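To quantify how many audit events are mis-classified SYN-ACK replies, the Hubble export (one JSON object per line in events.log) can be filtered for AUDIT verdicts where SYN and ACK are set but is_reply is false. A minimal sketch, using the field names from the log entry above (the inline sample is illustrative, not a real export line):

```python
import json

def is_misclassified_synack(line: str) -> bool:
    """True for AUDIT events whose TCP flags say SYN+ACK but is_reply is false."""
    flow = json.loads(line).get("flow", {})
    flags = flow.get("l4", {}).get("TCP", {}).get("flags", {})
    return bool(flow.get("verdict") == "AUDIT"
                and flags.get("SYN") and flags.get("ACK")
                and not flow.get("is_reply"))

# Illustrative sample shaped like the audit entry above:
sample = json.dumps({"flow": {"verdict": "AUDIT", "is_reply": False,
                              "l4": {"TCP": {"flags": {"SYN": True, "ACK": True}}}}})
print(is_misclassified_synack(sample))  # → True
```

On a correctly behaving datapath this filter should match nothing, since reply packets never reach policy evaluation.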
How can we reproduce the issue?
- Install Cilium with the Cilium CLI (note hostFirewall.enabled=true):
cilium install --version=v1.16.0 \
  --helm-set cni.exclusive=false \
  --helm-set ipam.mode=kubernetes \
  --helm-set identityAllocationMode=crd \
  --helm-set tunnelProtocol=geneve \
  --helm-set cni.chainingMode=portmap \
  --helm-set prometheus.enabled=true \
  --helm-set operator.prometheus.enabled=true \
  --helm-set hostFirewall.enabled=true \
  --helm-set operator.replicas=2 \
  --helm-set policyAuditMode=true \
  --helm-set policyEnforcementMode=always \
  --helm-set hubble.enabled=true \
  --helm-set hubble.export.static.enabled=true \
  --helm-set hubble.export.fileMaxSizeMb=100 \
  --helm-set hubble.export.static.filePath=/var/run/cilium/hubble/events.log \
  --helm-set hubble.export.static.allowList[0]='\{"verdict":["DROPPED"\,"ERROR"\,"AUDIT"]\}' \
  --helm-set hubble.export.static.fieldMask="{time,source.identity,source.namespace,source.pod_name,destination.identity,destination.namespace,destination.pod_name,source_service,destination_service,l4,IP,ethernet,l7,Type,node_name,is_reply,event_type,verdict,Summary}"
- Enable Hubble manually:
cilium hubble enable
- Create a pod
- From the pod, use curl to target a node outside of the cluster
Cilium Version
$ cilium version
cilium-cli: v0.16.13 compiled with go1.22.5 on linux/amd64
cilium image (default): v1.15.6
cilium image (stable): v1.16.2
cilium image (running): 1.16.0
Kernel Version
# uname -a
Linux NODENAME 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:14:24 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Kubernetes Version
$ kubectl version
Client Version: v1.30.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.30.1-eks-e564799
NOTE: we are using EKS Anywhere, which runs on our own hardware
Regression
This seems to be a regression, as the problem is not present with v1.15.9.
Sysdump
No response
Relevant log output
No response
Anything else?
No response
Cilium Users Document
- Are you a user of Cilium? Please add yourself to the Users doc
Code of Conduct
- I agree to follow this project's Code of Conduct