Reply from pod to outside is dropped when L7 ingress policy is used #21954

@brb

Description

This issue is about an L7 ingress policy problem that occurs when a pod is reached directly from an outside client or via a NodePort BPF service.

Let's consider the following L7 netpol:

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: foobar
spec:
  description: "Allow to GET on echo from outside"
  endpointSelector:
    matchLabels:
      kind: echo
  ingress:
  - fromEntities:
    - "world"
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
      rules:
        http:
        - method: "GET"
          path: "/$"

When the netpol is applied, accessing the echo pod from outside the cluster fails with:

xx drop (Stale or unroutable IP) flow 0xf863e34e to endpoint 0, file bpf_host.c line 665, , identity 28143->unknown: 10.0.1.49:80 -> 192.168.34.1:32884 tcp SYN, ACK

The drop is triggered by https://github.com/cilium/cilium/blob/v1.12.3/bpf/bpf_host.c#L626.

The L7 proxy sends the SYN-ACK, which is handled by bpf_host attached to cilium_host and then dropped. See the pwru output (ifindex=9 is cilium_host):

  SKB    CPU          PROCESS                     FUNC
0xffff9a859048e800      7        [<empty>]             ip_local_out netns=4026531992 mark=0xa00 ifindex=0 proto=0 mtu=0 len=60 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800      7        [<empty>]           __ip_local_out netns=4026531992 mark=0xa00 ifindex=0 proto=0 mtu=0 len=60 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800      7        [<empty>]             nf_hook_slow netns=4026531992 mark=0xa00 ifindex=0 proto=8 mtu=0 len=60 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800      7        [<empty>]                ip_output netns=4026531992 mark=0xa00 ifindex=0 proto=8 mtu=0 len=60 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800      7        [<empty>]             nf_hook_slow netns=4026531992 mark=0xa00 ifindex=9 proto=8 mtu=1500 len=60 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800      7        [<empty>]  apparmor_ipv4_postroute netns=4026531992 mark=0xa00 ifindex=9 proto=8 mtu=1500 len=60 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800      7        [<empty>]         ip_finish_output netns=4026531992 mark=0xa00 ifindex=9 proto=8 mtu=1500 len=60 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800      7        [<empty>] __cgroup_bpf_run_filter_skb netns=4026531992 mark=0xa00 ifindex=9 proto=8 mtu=1500 len=60 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800      7        [<empty>]       __ip_finish_output netns=4026531992 mark=0xa00 ifindex=9 proto=8 mtu=1500 len=60 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800      7        [<empty>]        ip_finish_output2 netns=4026531992 mark=0xa00 ifindex=9 proto=8 mtu=1500 len=60 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800      7        [<empty>]           dev_queue_xmit netns=4026531992 mark=0xa00 ifindex=9 proto=8 mtu=1500 len=74 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800      7        [<empty>]         __dev_queue_xmit netns=4026531992 mark=0xa00 ifindex=9 proto=8 mtu=1500 len=74 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800      7        [<empty>]             tcf_classify netns=4026531992 mark=0xa00 ifindex=9 proto=8 mtu=1500 len=74 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800      7        [<empty>]      skb_ensure_writable netns=4026531992 mark=0x0 ifindex=9 proto=8 mtu=1500 len=74 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800      7        [<empty>]                kfree_skb netns=4026531992 mark=0x0 ifindex=9 proto=8 mtu=1500 len=74 10.0.1.49:80->192.168.34.1:32884(tcp)

The packet is sent to cilium_host because of its mark (0xa00) and the following IP rule and routes:

$ ip rule list
...
10:     from all fwmark 0xa00/0xf00 lookup 2005

$ ip route show table 2005
default via 10.0.1.116 dev cilium_host
10.0.1.116 dev cilium_host scope link
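
To make the rule match concrete, here is a minimal sketch of the fwmark comparison that selects table 2005. The mark, mask, and table number come from the output above; the function names are hypothetical and only model the `fwmark value/mask` semantics, not any Cilium or kernel code.

```python
# Values taken from the pwru trace and the `ip rule list` output above.
RULE_VALUE = 0xA00   # fwmark value in the rule
RULE_MASK = 0xF00    # fwmark mask in the rule
PROXY_TABLE = 2005   # routing table the rule points to


def rule_matches(mark: int, value: int = RULE_VALUE, mask: int = RULE_MASK) -> bool:
    """Model `fwmark value/mask`: only the masked bits of the mark are compared."""
    return (mark & mask) == (value & mask)


def select_table(mark: int):
    """Return the table chosen by the rule, or None if the rule does not match.

    Hypothetical helper: a real lookup walks all rules in priority order.
    """
    return PROXY_TABLE if rule_matches(mark) else None


if __name__ == "__main__":
    # The SYN-ACK from the proxy carries mark=0xa00, so it is routed via
    # table 2005, whose default route points at cilium_host.
    print(select_table(0xA00))
    # An unmarked packet falls through to the remaining rules.
    print(select_table(0x0))
```

Because table 2005 only has a default route via cilium_host, every packet carrying this mark leaves through that device regardless of its destination.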

One fix is to extend the troublesome check https://github.com/cilium/cilium/blob/v1.12.3/bpf/bpf_host.c#L626 by allowing proxy replies to WORLD_ID.
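
The proposed change can be sketched as a small decision model. This is an illustration only, not the code in bpf_host.c: it assumes the check drops a packet when the destination has no known identity or resolves to the world identity, and it assumes a proxy reply can be recognized by the 0xa00/0xf00 mark seen in the trace. The function name, the verdict strings, and the `allow_proxy_to_world` switch are all hypothetical.

```python
WORLD_ID = 2          # Cilium's reserved numeric identity for "world"
PROXY_MARK = 0xA00    # mark seen on the proxy's SYN-ACK in the trace above
MARK_MASK = 0xF00     # mask from the `ip rule` above


def verdict(dst_identity, mark, allow_proxy_to_world):
    """Hypothetical model of the check on traffic leaving via cilium_host.

    dst_identity: numeric identity of the destination, or None if unknown.
    mark: skb mark as seen when the check runs (assumed still intact).
    allow_proxy_to_world: True models the proposed fix, False the current behavior.
    """
    is_proxy_reply = (mark & MARK_MASK) == PROXY_MARK
    if dst_identity is None:
        # No identity for the destination: nothing to forward to.
        return "drop"
    if dst_identity == WORLD_ID:
        # Proposed extension: let proxy replies to the outside client pass.
        if allow_proxy_to_world and is_proxy_reply:
            return "pass"
        return "drop"
    return "pass"


if __name__ == "__main__":
    # Current behavior: the proxy's reply to the outside client is dropped.
    print(verdict(WORLD_ID, 0xA00, allow_proxy_to_world=False))
    # With the proposed extension the same packet is let through.
    print(verdict(WORLD_ID, 0xA00, allow_proxy_to_world=True))
```

Keying the exception on the proxy mark keeps the drop in place for non-proxy traffic to WORLD_ID, which is the narrowest reading of the proposal.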

Metadata

Labels

area/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
feature/ipv6: Relates to IPv6 protocol support.
kind/bug: This is a bug in the Cilium logic.
pinned: These issues are not marked stale by our issue bot.
sig/policy: Impacts whether traffic is allowed or denied based on user-defined policies.
