Description
This issue describes an L7 ingress policy problem that occurs when a pod is reached directly by an outside client, via a NodePort BPF service.
Consider the following L7 netpol:
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: foobar
spec:
  description: "Allow to GET on echo from outside"
  endpointSelector:
    matchLabels:
      kind: echo
  ingress:
  - fromEntities:
    - "world"
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
      rules:
        http:
        - method: "GET"
          path: "/$"
When the netpol is applied, accessing the echo pod from outside the cluster fails with:
xx drop (Stale or unroutable IP) flow 0xf863e34e to endpoint 0, file bpf_host.c line 665, , identity 28143->unknown: 10.0.1.49:80 -> 192.168.34.1:32884 tcp SYN, ACK
The drop is triggered by https://github.com/cilium/cilium/blob/v1.12.3/bpf/bpf_host.c#L626.
What happens is that the L7 proxy sends the SYN-ACK, which gets handled by bpf_host at cilium_host and then dropped. See the pwru output below (ifindex=9 is cilium_host):
SKB CPU PROCESS FUNC
0xffff9a859048e800 7 [<empty>] ip_local_out netns=4026531992 mark=0xa00 ifindex=0 proto=0 mtu=0 len=60 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800 7 [<empty>] __ip_local_out netns=4026531992 mark=0xa00 ifindex=0 proto=0 mtu=0 len=60 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800 7 [<empty>] nf_hook_slow netns=4026531992 mark=0xa00 ifindex=0 proto=8 mtu=0 len=60 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800 7 [<empty>] ip_output netns=4026531992 mark=0xa00 ifindex=0 proto=8 mtu=0 len=60 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800 7 [<empty>] nf_hook_slow netns=4026531992 mark=0xa00 ifindex=9 proto=8 mtu=1500 len=60 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800 7 [<empty>] apparmor_ipv4_postroute netns=4026531992 mark=0xa00 ifindex=9 proto=8 mtu=1500 len=60 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800 7 [<empty>] ip_finish_output netns=4026531992 mark=0xa00 ifindex=9 proto=8 mtu=1500 len=60 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800 7 [<empty>] __cgroup_bpf_run_filter_skb netns=4026531992 mark=0xa00 ifindex=9 proto=8 mtu=1500 len=60 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800 7 [<empty>] __ip_finish_output netns=4026531992 mark=0xa00 ifindex=9 proto=8 mtu=1500 len=60 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800 7 [<empty>] ip_finish_output2 netns=4026531992 mark=0xa00 ifindex=9 proto=8 mtu=1500 len=60 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800 7 [<empty>] dev_queue_xmit netns=4026531992 mark=0xa00 ifindex=9 proto=8 mtu=1500 len=74 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800 7 [<empty>] __dev_queue_xmit netns=4026531992 mark=0xa00 ifindex=9 proto=8 mtu=1500 len=74 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800 7 [<empty>] tcf_classify netns=4026531992 mark=0xa00 ifindex=9 proto=8 mtu=1500 len=74 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800 7 [<empty>] skb_ensure_writable netns=4026531992 mark=0x0 ifindex=9 proto=8 mtu=1500 len=74 10.0.1.49:80->192.168.34.1:32884(tcp)
0xffff9a859048e800 7 [<empty>] kfree_skb netns=4026531992 mark=0x0 ifindex=9 proto=8 mtu=1500 len=74 10.0.1.49:80->192.168.34.1:32884(tcp)
The packet is sent to cilium_host because of the mark and the following IP rules / routes:
$ ip rule list
...
10: from all fwmark 0xa00/0xf00 lookup 2005
$ ip route show table 2005
default via 10.0.1.116 dev cilium_host
10.0.1.116 dev cilium_host scope link
One fix is to extend the troublesome check (https://github.com/cilium/cilium/blob/v1.12.3/bpf/bpf_host.c#L626) by allowing proxy replies destined to WORLD_ID to pass.