Enabling --enable-bpf-masquerade with enable-host-services=false when using kube-proxy prevents connectivity to ClusterIP services from remote hosts #12699

@joestringer

Summary

When Cilium has BPF masquerading enabled but host services disabled (leaving ClusterIP translation for host traffic to kube-proxy), hostNetwork pods (or applications running directly on hosts) cannot establish connectivity to pods on remote nodes via ClusterIP services. Regular pods are not affected by this issue.
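
To make the failure concrete, here is a minimal sketch of the kind of check that fails. Names are illustrative and assume the connectivity-check manifests (step 3 below) are deployed and that curl is available in the node image:

    # Resolve the ClusterIP of the echo-b service from the connectivity check:
    CLUSTER_IP=$(kubectl get svc echo-b -o jsonpath='{.spec.clusterIP}')
    # From the host network namespace of a KIND node, this times out...
    docker exec kind-control-plane curl -sS --connect-timeout 5 "http://${CLUSTER_IP}:80"
    # ...while the same request from a regular (non-hostNetwork) pod succeeds.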

Environment

  • Cilium v1.8.2
    • Using kube-proxy
  • Kernel 5.4
  • Kubernetes v1.17 via KIND v0.7.0

How to reproduce the issue

  1. Create a 2-node KIND cluster from the cilium git tree using kind create cluster --config=.github/kind-config.yaml (a sketch of such a config follows this list)
  2. Load Cilium via these instructions:

        - name: Load local images into kind cluster
          run: |
            kind load docker-image --name chart-testing cilium/cilium:latest
            kind load docker-image --name chart-testing cilium/operator-generic:latest
        - name: Install cilium chart
          run: |
            helm install cilium ./install/kubernetes/cilium \
              --wait \
              --namespace kube-system \
              --set global.nodeinit.enabled=true \
              --set global.kubeProxyReplacement=partial \
              --set global.hostServices.enabled=false \
              --set global.externalIPs.enabled=true \
              --set global.nodePort.enabled=true \
              --set global.hostPort.enabled=true \
              --set config.ipam=kubernetes \
              --set global.pullPolicy=Never

    (I later manually changed the image tag to v1.8.2, but you can docker pull ... and kind load docker-image ... to preload specific versions.)
  3. Deploy connectivity check YAML
  4. Edit the host-to-b-multi-node-clusterip deployment via kubectl edit deploy host-to-b-multi-node-clusterip (see the probe sketch after this list)
    • Replace livenessProbe with readinessProbe
  5. Observe that the new pod created for this deployment never becomes ready
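
For step 1, a minimal sketch of a 2-node KIND config (the actual .github/kind-config.yaml in the cilium tree may differ; disableDefaultCNI keeps kindnet out so Cilium can be installed as the CNI):

    kind: Cluster
    apiVersion: kind.x-k8s.io/v1alpha4
    nodes:
      - role: control-plane
      - role: worker
    networking:
      disableDefaultCNI: true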
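
For step 4, a hypothetical sketch of the probe swap (the exact container spec in the connectivity check may differ). The point of the swap is that a failing livenessProbe just restarts the container, while a failing readinessProbe leaves the pod visibly un-Ready, which is what step 5 checks for:

    # kubectl edit deploy host-to-b-multi-node-clusterip
    # ...replace the container's livenessProbe block with:
    readinessProbe:
      timeoutSeconds: 7
      exec:
        command: ["curl", "-sS", "--fail", "--connect-timeout", "5", "-o", "/dev/null", "echo-b"]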

Symptoms

Source node sees SYN go out to the remote pod via the overlay:

root@kind-control-plane:/home/cilium# cilium monitor
...
-> overlay flow 0x88ee9aa7 identity 6->0 state new ifindex cilium_vxlan orig-ip 0.0.0.0: 172.17.0.3:28440 -> 10.244.1.175:80 tcp SYN

The destination node (kind-worker) sees the SYN and passes it to the pod, which responds with a SYN-ACK. The reply is then handed up to the stack:

root@kind-worker:/home/cilium# cilium monitor
...
-> endpoint 351 flow 0xfd614d2c identity 6->13896 state new ifindex lxccc74c557b238 orig-ip 172.17.0.3: 172.17.0.3:29228 -> 10.244.1.175:80 tcp SYN
-> stack flow 0xeac287bd identity 13896->6 state reply ifindex 0 orig-ip 0.0.0.0: 10.244.1.175:80 -> 172.17.0.3:29228 tcp SYN, ACK

With tcpdump at the destination node (kind-worker), we see that the response is being SNAT'd to the node's own IP (172.17.0.2) and a new source port, so it no longer matches the connection the source host opened:

# tcpdump -nvei eth0 | grep 172.17.0.3
...
    172.17.0.3.50507 > 10.244.1.175.80: Flags [S], cksum 0xb8e5 (incorrect -> 0x36d2), seq 2626802039, win 64240, options [mss 1460,sackOK,TS val 3387777011 ecr 0,nop,wscale 7], length 0
...
    172.17.0.2.57629 > 172.17.0.3.50507: Flags [S.], cksum 0x5856 (incorrect -> 0x72a4), seq 2606385031, ack 2626802040, win 64308, options [mss 1410,sackOK,TS val 901524270 ecr 3387777011,nop,wscale 7], length 0
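
To confirm which masquerading mode the agent is actually running with, the rendered ConfigMap value can be queried directly (whether cilium status also prints a Masquerading line depends on the Cilium version):

    # Value of the flag as rendered into the agent ConfigMap:
    kubectl -n kube-system get configmap cilium-config \
      -o jsonpath='{.data.enable-bpf-masquerade}'
    # <cilium-pod> is a placeholder for any cilium agent pod name:
    kubectl -n kube-system exec <cilium-pod> -- cilium status --verbose | grep -i masquerad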

Mitigation

I manually modified the cilium-config ConfigMap to set enable-bpf-masquerade: "false", then restarted the Cilium pods, and the connectivity check started working.
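
For reference, the same mitigation scripted (a sketch; names assume the default kube-system install from the helm command above):

    # Flip the flag in the agent ConfigMap:
    kubectl -n kube-system patch configmap cilium-config --type merge \
      -p '{"data":{"enable-bpf-masquerade":"false"}}'
    # Restart the agents so they pick up the change:
    kubectl -n kube-system rollout restart daemonset/cilium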

Labels

  • area/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
  • area/kube-proxy: Issues related to kube-proxy (not the kube-proxy-free mode).
  • area/loadbalancing: Impacts load-balancing and Kubernetes service implementations.
  • feature/snat: Relates to SNAT or Masquerading of traffic.
  • kind/bug: This is a bug in the Cilium logic.
