Regression in v1.16.X: a reply to an outgoing connection from a pod is registered as new incoming on host level #35056

@Cajga

Is there an existing issue for this?

  • I have searched the existing issues

Version

equal or higher than v1.16.0 and lower than v1.17.0

What happened?

When our pods initiate an outgoing connection to an external host, the reply (the SYN-ACK) is registered in the ct table as a new incoming connection (and would be dropped if we were not in audit mode).

Here is the cilium-dbg monitor output for the traffic (a monitoring probe that performs an HTTP GET):

# cilium-dbg monitor |grep 172.27.210.23
Listening for events on 56 CPUs with 64x4096 of shared memory
Press Ctrl-C to quit
time="2024-09-26T14:45:06Z" level=info msg="Initializing dissection cache..." subsys=monitor
Policy verdict log: flow 0x8e7c81d1 local EP ID 3807, remote ID 16777226, proto 6, egress, action allow, auth: disabled, match L3-Only, 100.80.3.228:36946 -> 172.27.210.23:443 tcp SYN
-> stack flow 0x8e7c81d1 , identity 49712->16777226 state new ifindex 0 orig-ip 0.0.0.0: 100.80.3.228:36946 -> 172.27.210.23:443 tcp SYN
Policy verdict log: flow 0x0 local EP ID 594, remote ID 16777226, proto 6, ingress, action audit, auth: disabled, match none, 172.27.210.23:443 -> 172.27.50.155:36946 tcp SYN, ACK
-> endpoint 3807 flow 0x0 , identity 16777226->49712 state reply ifindex lxc7fb4017156e7 orig-ip 172.27.210.23: 172.27.210.23:443 -> 100.80.3.228:36946 tcp SYN, ACK
-> stack flow 0x8e7c81d1 , identity 49712->16777226 state established ifindex 0 orig-ip 0.0.0.0: 100.80.3.228:36946 -> 172.27.210.23:443 tcp ACK
-> stack flow 0x8e7c81d1 , identity 49712->16777226 state established ifindex 0 orig-ip 0.0.0.0: 100.80.3.228:36946 -> 172.27.210.23:443 tcp ACK
-> endpoint 3807 flow 0x0 , identity 16777226->49712 state reply ifindex lxc7fb4017156e7 orig-ip 172.27.210.23: 172.27.210.23:443 -> 100.80.3.228:36946 tcp ACK
-> endpoint 3807 flow 0x0 , identity 16777226->49712 state reply ifindex lxc7fb4017156e7 orig-ip 172.27.210.23: 172.27.210.23:443 -> 100.80.3.228:36946 tcp ACK, FIN
-> stack flow 0x8e7c81d1 , identity 49712->16777226 state established ifindex 0 orig-ip 0.0.0.0: 100.80.3.228:36946 -> 172.27.210.23:443 tcp ACK, FIN

The same traffic registers two entries in the ct map (the second, a TCP IN entry, is clearly wrong):

# cilium-dbg bpf ct list global |grep 100.80.3.228:36946
TCP OUT 100.80.3.228:36946 -> 172.27.210.23:443 expires=3026763 Packets=0 Bytes=0 RxFlagsSeen=0x1b LastRxReport=3026753 TxFlagsSeen=0x1b LastTxReport=3026753 Flags=0x0013 [ RxClosing TxClosing SeenNonSyn ] RevNAT=0 SourceSecurityID=49712 IfIndex=0 BackendID=0 
# cilium-dbg bpf ct list global |grep 172.27.50.155:36946
TCP IN 172.27.210.23:443 -> 172.27.50.155:36946 expires=3034753 Packets=0 Bytes=0 RxFlagsSeen=0x1b LastRxReport=3026753 TxFlagsSeen=0x00 LastTxReport=0 Flags=0x0011 [ RxClosing SeenNonSyn ] RevNAT=0 SourceSecurityID=16777226 IfIndex=0 BackendID=0 

While the "TCP OUT" entry disappears (presumably cleaned up because the connection closed properly), many "TCP IN" entries are left hanging:

# cilium-dbg bpf ct list global|grep "^TCP IN 172.27.210.23:443"|wc -l
2368
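To compare how many entries exist in each direction for one remote endpoint, the two greps above can be folded into a single pass. This is a minimal sketch, assuming the `cilium-dbg bpf ct list global` line layout shown above (protocol, direction, source, `->`, destination); `ct_summary` is a hypothetical helper name, not part of cilium-dbg.

```shell
# ct_summary is a hypothetical helper, not a cilium-dbg subcommand.
# It reads `cilium-dbg bpf ct list global` output on stdin and counts
# TCP entries for one remote ip:port, split by direction.
ct_summary() {
    peer="$1"   # remote endpoint, e.g. 172.27.210.23:443
    awk -v peer="$peer" '
        # OUT entries: the remote endpoint is the destination (field 5)
        $1 == "TCP" && $2 == "OUT" && $5 == peer { n_out++ }
        # IN entries: the remote endpoint is the source (field 3)
        $1 == "TCP" && $2 == "IN"  && $3 == peer { n_in++ }
        END { printf "OUT=%d IN=%d\n", n_out, n_in }
    '
}
```

Usage: `cilium-dbg bpf ct list global | ct_summary 172.27.210.23:443`. An IN count that keeps growing while the OUT count stays near zero would match the behaviour described here.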

We run Cilium in default-deny mode with policy audit enabled:

$ cilium config view |egrep  -e enable-policy -e policy-audit-mode
enable-policy                                     always
policy-audit-mode                                 true

Every such connection produces an entry in the audit log (which says that the SYN-ACK packet is not a reply):

{
  "flow": {
    "time": "2024-09-26T12:17:05.377566426Z",
    "verdict": "AUDIT",
    "ethernet": {
      "source": "MAC1",
      "destination": "MAC2"
    },
    "IP": {
      "source": "172.27.210.23",
      "destination": "172.27.50.155",
      "ipVersion": "IPv4"
    },
    "l4": {
      "TCP": {
        "source_port": 443,
        "destination_port": 36582,
        "flags": {
          "SYN": true,
          "ACK": true
        }
      }
    },
    "source": {
      "identity": 16777226
    },
    "destination": {
      "identity": 1
    },
    "Type": "L3_L4",
    "node_name": "NODENAME",
    "event_type": {
      "type": 5
    },
    "is_reply": false,
    "Summary": "TCP Flags: SYN, ACK"
  },
  "node_name": "NODENAME",
  "time": "2024-09-26T12:17:05.377566426Z"
}
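To see how often this happens, the exported events can be counted directly. This is a rough sketch, assuming the export file (the `hubble.export.static.filePath` from the install command below) holds one compact JSON object per line, unlike the pretty-printed sample above; `count_unmatched_synack` is a hypothetical helper name. It uses plain grep so it needs no extra tooling, at the cost of matching on text rather than parsing JSON.

```shell
# count_unmatched_synack is a hypothetical helper, not part of Cilium or Hubble.
# Assumption: the file given as $1 contains one JSON flow event per line.
# It counts AUDIT-verdict flows that carry SYN and ACK but are not
# marked as replies.
count_unmatched_synack() {
    grep -E '"verdict":[[:space:]]*"AUDIT"' "$1" |
        grep -E '"is_reply":[[:space:]]*false' |
        grep -E '"SYN":[[:space:]]*true' |
        grep -cE '"ACK":[[:space:]]*true' || true   # grep -c exits 1 on a zero count
}
```

Usage: `count_unmatched_synack /var/run/cilium/hubble/events.log`.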

How can we reproduce the issue?

  1. Install cilium with cilium cli (note the hostFirewall.enabled=true):
cilium install --version=v1.16.0 --helm-set cni.exclusive=false --helm-set ipam.mode=kubernetes --helm-set identityAllocationMode=crd --helm-set tunnelProtocol=geneve --helm-set cni.chainingMode=portmap --helm-set prometheus.enabled=true --helm-set operator.prometheus.enabled=true --helm-set hostFirewall.enabled=true --helm-set operator.replicas=2 --helm-set policyAuditMode=true --helm-set policyEnforcementMode=always --helm-set hubble.enabled=true --helm-set hubble.export.static.enabled=true --helm-set hubble.export.fileMaxSizeMb=100 --helm-set hubble.export.static.filePath=/var/run/cilium/hubble/events.log --helm-set hubble.export.static.allowList[0]='\{"verdict":["DROPPED"\,"ERROR"\,"AUDIT"]\}' --helm-set hubble.export.static.fieldMask="{time,source.identity,source.namespace,source.pod_name,destination.identity,destination.namespace,destination.pod_name,source_service,destination_service,l4,IP,ethernet,l7,Type,node_name,is_reply,event_type,verdict,Summary}"
  2. Enable Hubble manually:
cilium hubble enable
  3. Create a pod.
  4. From the pod, use curl to target a node outside of the cluster.

Cilium Version

$ cilium version
cilium-cli: v0.16.13 compiled with go1.22.5 on linux/amd64
cilium image (default): v1.15.6
cilium image (stable): v1.16.2
cilium image (running): 1.16.0

Kernel Version

# uname -a
Linux NODENAME 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:14:24 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Kubernetes Version

$ kubectl version
Client Version: v1.30.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.30.1-eks-e564799

NOTE: we are using EKS Anywhere running on our own hardware.

Regression

This seems to be a regression, as the problem does not occur with v1.15.9.

Sysdump

No response

Relevant log output

No response

Anything else?

No response

Cilium Users Document

  • Are you a user of Cilium? Please add yourself to the Users doc

Code of Conduct

  • I agree to follow this project's Code of Conduct

Labels

  • area/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
  • area/host-firewall: Impacts the host firewall or the host endpoint.
  • kind/bug: This is a bug in the Cilium logic.
  • kind/community-report: This was reported by a user in the Cilium community, eg via Slack.
  • kind/regression: This functionality worked fine before, but was broken in a newer release of Cilium.
  • needs/triage: This issue requires triaging to establish severity and next steps.
