
Cilium 1.16 identifies reply to outgoing traffic that leaves the cluster as new connection #35535

@Cajga

Description

Is there an existing issue for this?

  • I have searched the existing issues

Version

equal or higher than v1.16.0 and lower than v1.17.0

What happened?

Cilium 1.16.3 (host firewall enabled, encapsulation mode (Geneve), with kube-proxy) identifies the reply to an outgoing connection that leaves the cluster as a new connection. This creates an extra (stale) entry in the CT map for every single connection leaving the cluster, and the reply would be dropped by the host firewall (we use default-deny; in the kind reproduction below policy audit mode is enabled, which is why the flows show AUDITED instead of DROPPED). We reproduced the problem in a kind cluster to make it simpler to debug.
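
For reference, a minimal sketch of the kind of default-deny host policy we use (the policy name and the allowed entity are illustrative, not our exact production policy):

apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: host-default-deny          # illustrative name
spec:
  description: "Default-deny ingress on all nodes (sketch)"
  nodeSelector: {}                 # empty selector: applies to the host endpoint on every node
  ingress:
  - fromEntities:
    - cluster                      # only in-cluster peers allowed; replies misclassified as new world traffic get denied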

Here are the Hubble flow logs for a single curl request made from a pod (note the third line, where Cilium identifies the SYN, ACK reply as a new ingress connection to the host):

# hubble observe flows -f |grep '79.172.255.103'
Oct 25 07:39:55.531: default/curly:36642 (ID:60372) -> 79.172.255.103:80 (world) policy-verdict:none EGRESS AUDITED (TCP Flags: SYN)
Oct 25 07:39:55.531: default/curly:36642 (ID:60372) -> 79.172.255.103:80 (world) to-stack FORWARDED (TCP Flags: SYN)
Oct 25 07:39:55.571: 79.172.255.103:80 (world) -> 172.18.0.3:36642 (host) policy-verdict:none INGRESS AUDITED (TCP Flags: SYN, ACK)
Oct 25 07:39:55.571: default/curly:36642 (ID:60372) <- 79.172.255.103:80 (world) to-endpoint FORWARDED (TCP Flags: SYN, ACK)
Oct 25 07:39:55.571: default/curly:36642 (ID:60372) -> 79.172.255.103:80 (world) to-stack FORWARDED (TCP Flags: ACK)
Oct 25 07:39:55.571: default/curly:36642 (ID:60372) -> 79.172.255.103:80 (world) to-stack FORWARDED (TCP Flags: ACK, PSH)
Oct 25 07:39:55.612: default/curly:36642 (ID:60372) <- 79.172.255.103:80 (world) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Oct 25 07:39:55.612: default/curly:36642 (ID:60372) -> 79.172.255.103:80 (world) to-stack FORWARDED (TCP Flags: ACK, FIN)
Oct 25 07:39:55.652: default/curly:36642 (ID:60372) <- 79.172.255.103:80 (world) to-endpoint FORWARDED (TCP Flags: ACK, FIN)
Oct 25 07:39:55.652: default/curly:36642 (ID:60372) -> 79.172.255.103:80 (world) to-stack FORWARDED (TCP Flags: ACK)

Relevant CT map entries after a few tests (note the two "TCP IN" entries, which are actually replies to our outgoing traffic):

# cilium-dbg bpf ct list global|grep 79.172.255.103
TCP IN 79.172.255.103:80 -> 172.18.0.3:36642 expires=9143 Packets=0 Bytes=0 RxFlagsSeen=0x1b LastRxReport=1143 TxFlagsSeen=0x00 LastTxReport=0 Flags=0x0011 [ RxClosing SeenNonSyn ] RevNAT=0 SourceSecurityID=2 IfIndex=0 BackendID=0 
TCP IN 79.172.255.103:80 -> 172.18.0.3:55530 expires=9223 Packets=0 Bytes=0 RxFlagsSeen=0x1b LastRxReport=1223 TxFlagsSeen=0x00 LastTxReport=0 Flags=0x0011 [ RxClosing SeenNonSyn ] RevNAT=0 SourceSecurityID=2 IfIndex=0 BackendID=0 
TCP OUT 10.244.1.156:50708 -> 79.172.255.103:80 expires=1948 Packets=0 Bytes=0 RxFlagsSeen=0x1b LastRxReport=1938 TxFlagsSeen=0x1b LastTxReport=1938 Flags=0x0013 [ RxClosing TxClosing SeenNonSyn ] RevNAT=0 SourceSecurityID=60372 IfIndex=0 BackendID=0 
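
SourceSecurityID=2 on the stale "TCP IN" entries is the reserved world identity. A quick sanity check from the same node, using the same debug CLI as above (sketch):

# cilium-dbg identity get 2        # should list the reserved:world label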

Cilium config:

apiVersion: v1
data:
  agent-not-ready-taint-key: node.cilium.io/agent-not-ready
  arping-refresh-period: 30s
  auto-direct-node-routes: "false"
  bpf-events-drop-enabled: "true"
  bpf-events-policy-verdict-enabled: "true"
  bpf-events-trace-enabled: "true"
  bpf-lb-acceleration: disabled
  bpf-lb-external-clusterip: "false"
  bpf-lb-map-max: "65536"
  bpf-lb-sock: "false"
  bpf-lb-sock-terminate-pod-connections: "false"
  bpf-map-dynamic-size-ratio: "0.0025"
  bpf-policy-map-max: "16384"
  bpf-root: /sys/fs/bpf
  cgroup-root: /run/cilium/cgroupv2
  cilium-endpoint-gc-interval: 5m0s
  cluster-id: "0"
  cluster-name: kind-kind
  clustermesh-enable-endpoint-sync: "false"
  clustermesh-enable-mcs-api: "false"
  cni-chaining-mode: portmap
  cni-exclusive: "false"
  cni-log-file: /var/run/cilium/cilium-cni.log
  custom-cni-conf: "false"
  datapath-mode: veth
  debug: "false"
  debug-verbose: ""
  direct-routing-skip-unreachable: "false"
  dnsproxy-socket-linger-timeout: "10"
  egress-gateway-reconciliation-trigger-interval: 1s
  enable-auto-protect-node-port-range: "true"
  enable-bpf-clock-probe: "false"
  enable-endpoint-health-checking: "true"
  enable-external-ips: "false"
  enable-health-check-loadbalancer-ip: "false"
  enable-health-check-nodeport: "true"
  enable-health-checking: "true"
  enable-host-firewall: "true"
  enable-host-legacy-routing: "true"
  enable-host-port: "false"
  enable-hubble: "true"
  enable-ipv4: "true"
  enable-ipv4-big-tcp: "false"
  enable-ipv4-masquerade: "true"
  enable-ipv6: "false"
  enable-ipv6-big-tcp: "false"
  enable-ipv6-masquerade: "true"
  enable-k8s-networkpolicy: "true"
  enable-k8s-terminating-endpoint: "true"
  enable-l2-neigh-discovery: "true"
  enable-l7-proxy: "true"
  enable-local-redirect-policy: "false"
  enable-masquerade-to-route-source: "false"
  enable-metrics: "true"
  enable-node-port: "false"
  enable-node-selector-labels: "false"
  enable-policy: always
  enable-runtime-device-detection: "true"
  enable-sctp: "false"
  enable-svc-source-range-check: "true"
  enable-tcx: "true"
  enable-vtep: "false"
  enable-well-known-identities: "false"
  enable-xt-socket-fallback: "true"
  envoy-base-id: "0"
  envoy-keep-cap-netbindservice: "false"
  external-envoy-proxy: "true"
  hubble-disable-tls: "false"
  hubble-export-file-max-backups: "5"
  hubble-export-file-max-size-mb: "10"
  hubble-listen-address: :4244
  hubble-socket-path: /var/run/cilium/hubble.sock
  hubble-tls-cert-file: /var/lib/cilium/tls/hubble/server.crt
  hubble-tls-client-ca-files: /var/lib/cilium/tls/hubble/client-ca.crt
  hubble-tls-key-file: /var/lib/cilium/tls/hubble/server.key
  identity-allocation-mode: crd
  identity-gc-interval: 15m0s
  identity-heartbeat-timeout: 30m0s
  install-no-conntrack-iptables-rules: "false"
  ipam: kubernetes
  ipam-cilium-node-update-rate: 15s
  k8s-client-burst: "20"
  k8s-client-qps: "10"
  k8s-require-ipv4-pod-cidr: "false"
  k8s-require-ipv6-pod-cidr: "false"
  kube-proxy-replacement: "false"
  kube-proxy-replacement-healthz-bind-address: ""
  max-connected-clusters: "255"
  mesh-auth-enabled: "true"
  mesh-auth-gc-interval: 5m0s
  mesh-auth-queue-size: "1024"
  mesh-auth-rotated-identities-queue-size: "1024"
  monitor-aggregation: medium
  monitor-aggregation-flags: all
  monitor-aggregation-interval: 5s
  nat-map-stats-entries: "32"
  nat-map-stats-interval: 30s
  node-port-bind-protection: "true"
  nodeport-addresses: ""
  nodes-gc-interval: 5m0s
  operator-api-serve-addr: 127.0.0.1:9234
  operator-prometheus-serve-addr: :9963
  policy-audit-mode: "true"
  policy-cidr-match-mode: ""
  preallocate-bpf-maps: "false"
  procfs: /host/proc
  proxy-connect-timeout: "2"
  proxy-idle-timeout-seconds: "60"
  proxy-max-connection-duration-seconds: "0"
  proxy-max-requests-per-connection: "0"
  proxy-xff-num-trusted-hops-egress: "0"
  proxy-xff-num-trusted-hops-ingress: "0"
  remove-cilium-node-taints: "true"
  routing-mode: tunnel
  service-no-backend-response: reject
  set-cilium-is-up-condition: "true"
  set-cilium-node-taints: "true"
  synchronize-k8s-nodes: "true"
  tofqdns-dns-reject-response-code: refused
  tofqdns-enable-dns-compression: "true"
  tofqdns-endpoint-max-ip-per-hostname: "50"
  tofqdns-idle-connection-grace-period: 0s
  tofqdns-max-deferred-connection-deletes: "10000"
  tofqdns-proxy-response-max-delay: 100ms
  tunnel-protocol: geneve
  unmanaged-pod-watcher-interval: "15"
  vtep-cidr: ""
  vtep-endpoint: ""
  vtep-mac: ""
  vtep-mask: ""
  write-cni-conf-when-ready: /host/etc/cni/net.d/05-cilium.conflist
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: cilium
    meta.helm.sh/release-namespace: kube-system
  creationTimestamp: "2024-10-25T07:34:45Z"
  labels:
    app.kubernetes.io/managed-by: Helm
  name: cilium-config
  namespace: kube-system
  resourceVersion: "804"
  uid: d4585746-8c01-4c35-ab17-12b212ccb332

How can we reproduce the issue?

  1. create kind cluster:
cat <<EOT >> kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  disableDefaultCNI: true        # do not install kindnet
nodes:
- role: control-plane
- role: worker
EOT
kind create cluster --config ./kind-config.yaml --image kindest/node:v1.30.4
  2. install cilium
cilium install --version=v1.16.3 --helm-set cni.exclusive=false --helm-set ipam.mode=kubernetes --helm-set identityAllocationMode=crd --helm-set tunnelProtocol=geneve --helm-set cni.chainingMode=portmap --helm-set hostFirewall.enabled=true --helm-set operator.replicas=2 --helm-set policyAuditMode=true --helm-set policyEnforcementMode=always --helm-set hubble.enabled=true
cilium status --wait
  3. enable hubble
cilium hubble enable
cilium status --wait
  4. run a pod
kubectl run -it --rm --image=curlimages/curl curly -- /bin/sh
  5. call a curl command and observe the hubble flow logs from another terminal (a concrete example follows this list)
hubble observe flows -f |grep 'IPWHERETHECONNECTIONGOES'
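
For example, assuming the pod can reach some external web server (the IP below is a placeholder from the TEST-NET-3 range, not the address from the logs above):

# inside the curly pod
curl -v http://203.0.113.10/

# in another terminal, e.g. after `cilium hubble port-forward &` or from inside a cilium agent pod
hubble observe flows -f | grep '203.0.113.10'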

Cilium Version

# cilium version
cilium-cli: v0.16.19 compiled with go1.23.1 on linux/amd64
cilium image (default): v1.16.2
cilium image (stable): v1.16.3
cilium image (running): 1.16.3

Kernel Version

# uname -a
Linux ip-10-100-0-143 6.8.0-1017-aws #18~22.04.1-Ubuntu SMP Thu Oct  3 19:57:42 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Kubernetes Version

# kubectl version
Client Version: v1.31.2
Kustomize Version: v5.4.2
Server Version: v1.30.4

Regression

Yes, the issue cannot be reproduced with 1.15.10
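
One way to confirm the regression on the same kind cluster is to reinstall Cilium with 1.15.10 (a sketch; the --helm-set flags mirror step 2 of the reproduction above):

cilium uninstall
cilium install --version=v1.15.10 --helm-set cni.exclusive=false --helm-set ipam.mode=kubernetes --helm-set identityAllocationMode=crd --helm-set tunnelProtocol=geneve --helm-set cni.chainingMode=portmap --helm-set hostFirewall.enabled=true --helm-set operator.replicas=2 --helm-set policyAuditMode=true --helm-set policyEnforcementMode=always --helm-set hubble.enabled=true
cilium status --wait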

Sysdump

cilium-sysdump-20241025-075558.zip

Relevant log output

No response

Anything else?

This ticket obsoletes the old one that I opened, as it contains all the information in one place, reproduced on a kind cluster.

Cilium Users Document

  • Are you a user of Cilium? Please add yourself to the Users doc

Code of Conduct

  • I agree to follow this project's Code of Conduct

Labels

  • area/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
  • area/host-firewall: Impacts the host firewall or the host endpoint.
  • kind/bug: This is a bug in the Cilium logic.
  • kind/community-report: This was reported by a user in the Cilium community, eg via Slack.
  • kind/regression: This functionality worked fine before, but was broken in a newer release of Cilium.
