Incompatibility between Istio ambient mode readiness probes and Calico eBPF dataplane #9157

@howardjohn



Expected Behavior

Istio ambient mode works with Calico in eBPF mode.

Current Behavior

Readiness probes fail with Istio ambient mode when Calico is configured to use eBPF mode. Istio issue: istio/istio#52765.

The Calico setup is reproduced on kind with helm install calico projectcalico/tigera-operator --version v3.28.1 -f values.yaml --namespace tigera-operator --create-namespace, using this values.yaml:

installation:
  enabled: true
defaultFelixConfiguration:
  enabled: true
  bpfEnabled: true
  bpfLogLevel: "Debug"

Istio is installed with istio/istio-1.23.0/bin/istioctl install --set profile=ambient -y.

A simple reproducer app (any will do as long as it has a readiness probe):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo
spec:
  selector:
    matchLabels:
      app: echo
  template:
    metadata:
      labels:
        app: echo
    spec:
      securityContext:
        sysctls:
        - name: net.ipv4.ip_unprivileged_port_start
          value: "0"
      containers:
      - name: echo
        image: gcr.io/istio-testing/app:latest
        imagePullPolicy: IfNotPresent
        readinessProbe:
          httpGet:
            port: 12345
          periodSeconds: 1
        args:
        - --port=12345

With this, all requests from the host network to the pod fail. The most obvious case is kubelet health checks, but this applies to any request from the host.

The networking path works as follows: Istio inserts host iptables rules that SNAT TCP traffic from host sockets to probe targets so that it appears to come from a fixed link-local address. The rules look like:

*nat
:ISTIO_POSTRT - [0:0]
-A POSTROUTING -j ISTIO_POSTRT
-A ISTIO_POSTRT -p tcp -m owner --socket-exists -m set --match-set istio-inpod-probes-v4 dst -j SNAT --to-source 169.254.7.127
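
To make the effect of this rule concrete, here is a minimal Python sketch of what happens to the connection tuple. It is only a model, not Istio code: the `Flow` type and helper names are made up for illustration, the host IP 172.18.0.2 is a hypothetical kind node address, the pod address and client port are taken from the pwru output in this issue, and it assumes SNAT keeps the source port (the usual case when there is no collision). It also ignores the -p tcp and --socket-exists matches.

```python
from dataclasses import dataclass, replace

# --to-source value from the ISTIO_POSTRT rule above.
PROBE_SNAT_IP = "169.254.7.127"


@dataclass(frozen=True)
class Flow:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def snat_probe(flow: Flow, probe_set: set) -> Flow:
    """Rewrite the source IP only when the destination is in the ipset."""
    if flow.dst_ip in probe_set:
        return replace(flow, src_ip=PROBE_SNAT_IP)
    return flow


def reply_of(flow: Flow) -> Flow:
    """Return the tuple of the reply packet for a given request."""
    return Flow(flow.dst_ip, flow.dst_port, flow.src_ip, flow.src_port)


# Stand-in for the istio-inpod-probes-v4 ipset (pod IP from the pwru output).
probe_set = {"10.244.190.134"}

# Host-originated probe; 172.18.0.2 is a hypothetical node IP.
request = Flow("172.18.0.2", 39500, "10.244.190.134", 12345)
on_wire = snat_probe(request, probe_set)
reply = reply_of(on_wire)

print("request on wire:", on_wire)
print("reply from pod: ", reply)
```

Under these assumptions the reply is 10.244.190.134:12345 -> 169.254.7.127:39500, which is the tuple that shows up as dropped on the return path.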

We see the packets are dropped on the return path:

pwru (probe is on 12345):

10.244.190.134:12345->169.254.7.127:39500(tcp)  kfree_skb_reason(SKB_DROP_REASON_TC_EGRESS)

Trace logs (I used curl instead of kubelet, but the result is the same; local port 12346, health check port 12345):

            curl-1954511 [010] b.s2. 617201.255780: bpf_trace_printk: eth0------------E: New packet at ifindex=113; mark=1000000
            curl-1954511 [010] b.s2. 617201.255781: bpf_trace_printk: eth0------------E: IP id=0 len=60
            curl-1954511 [010] b.s2. 617201.255782: bpf_trace_printk: eth0------------E: IP s=af4be84 d=a9fe077f
            curl-1954511 [010] b.s2. 617201.255782: bpf_trace_printk: eth0------------E: IP ihl=20 bytes
            curl-1954511 [010] b.s2. 617201.255783: bpf_trace_printk: eth0------------E: TCP; ports: s=12345 d=12346
            curl-1954511 [010] b.s2. 617201.255784: bpf_trace_printk: eth0------------E: CT: lookup from af4be84:12345
            curl-1954511 [010] b.s2. 617201.255784: bpf_trace_printk: eth0------------E: CT: lookup to   a9fe077f:12346
            curl-1954511 [010] b.s2. 617201.255785: bpf_trace_printk: eth0------------E: CT: Hit! NORMAL entry.
            curl-1954511 [010] b.s2. 617201.255786: bpf_trace_printk: eth0------------E: CT: Packet not allowed by ingress/egress approval flags (FH).
            curl-1954511 [010] b.s2. 617201.255787: bpf_trace_printk: eth0------------E: CT: result: 0x6
            curl-1954511 [010] b.s2. 617201.255787: bpf_trace_printk: eth0------------E: conntrack entry flags 0x0
            curl-1954511 [010] b.s2. 617201.255787: bpf_trace_printk: eth0------------E: CT Hit
            curl-1954511 [010] b.s2. 617201.255788: bpf_trace_printk: eth0------------E: jump to idx 2 prog at 21
            curl-1954511 [010] b.s2. 617201.255789: bpf_trace_printk: eth0------------E: Entering calico_tc_skb_accepted_entrypoint
            curl-1954511 [010] b.s2. 617201.255790: bpf_trace_printk: eth0------------E: Entering calico_tc_skb_accepted
            curl-1954511 [010] b.s2. 617201.255790: bpf_trace_printk: eth0------------E: src=af4be84 dst=a9fe077f
            curl-1954511 [010] b.s2. 617201.255791: bpf_trace_printk: eth0------------E: post_nat=0:0
            curl-1954511 [010] b.s2. 617201.255791: bpf_trace_printk: eth0------------E: tun_ip=0
            curl-1954511 [010] b.s2. 617201.255791: bpf_trace_printk: eth0------------E: pol_rc=1
            curl-1954511 [010] b.s2. 617201.255792: bpf_trace_printk: eth0------------E: sport=12345
            curl-1954511 [010] b.s2. 617201.255792: bpf_trace_printk: eth0------------E: dport=12346
            curl-1954511 [010] b.s2. 617201.255792: bpf_trace_printk: eth0------------E: flags=20
            curl-1954511 [010] b.s2. 617201.255793: bpf_trace_printk: eth0------------E: ct_rc=6
            curl-1954511 [010] b.s2. 617201.255793: bpf_trace_printk: eth0------------E: ct_related=0
            curl-1954511 [010] b.s2. 617201.255793: bpf_trace_printk: eth0------------E: mark=0x1000000
            curl-1954511 [010] b.s2. 617201.255794: bpf_trace_printk: eth0------------E: ip->ttl 63
            curl-1954511 [010] b.s2. 617201.255795: bpf_trace_printk: eth0------------E: Final result=DENY (0). Program execution time: 14016ns
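
The addresses in the trace are printed as hex. A small Python helper for decoding them, assuming the log prints each IPv4 address as a big-endian integer with leading zeros dropped (this matches a9fe077f being the SNAT address from the iptables rule):

```python
import ipaddress


def decode_hex_ip(hex_str: str) -> str:
    """Convert a hex IPv4 address from the trace log to dotted-quad form."""
    return str(ipaddress.IPv4Address(int(hex_str, 16)))


print(decode_hex_ip("af4be84"))   # 10.244.190.132
print(decode_hex_ip("a9fe077f"))  # 169.254.7.127
```

So the denied packet is the pod's reply from port 12345 to 169.254.7.127:12346, i.e. the return path to the SNATed source.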

Possible Solution

Note: I am an Istio maintainer.

We are happy to make changes on the Istio side if needed to make this work, but right now it is not entirely clear to us what is breaking things. One theory we had is in istio/istio#52765 (comment), but we are not sure.
