Closed
Bug
Labels
area/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
kind/bug: This is a bug in the Cilium logic.
Description
Is there an existing issue for this?
- I have searched the existing issues
Version
equal or higher than v1.16.0 and lower than v1.17.0
What happened?
When BPF host routing is enabled, the expected egress path for pod-to-world traffic is a direct bpf_redirect from lxc to eth0. However, if tunneling is also enabled, the traffic is punted to the stack instead.
How can we reproduce the issue?
- Setup kind
./contrib/scripts/kind.sh --xdp --secondary-network "" 3 "" "" none dual 0.0.0.0 6443
- Build image
# I'm at 736337ca18ae7f23541567a4452cdd362d035817, the head of main as of 17 Sep
make kind-image
- Create cilium
cli install --wait --chart-directory=./install/kubernetes/cilium --helm-set=debug.enabled=true --helm-set=debug.verbose=datapath --helm-set=hubble.eventBufferCapacity=65535 --helm-set=bpf.monitorAggregation=none --helm-set=cluster.name=default --helm-set=authentication.mutual.spire.enabled=false --nodes-without-cilium --helm-set-string=kubeProxyReplacement=true --helm-set-string=routingMode=tunnel --helm-set=devices={eth0} --helm-set-string=loadBalancer.mode=snat --helm-set=ipv6.enabled=true --helm-set=bpf.masquerade=true --helm-set=egressGateway.enabled=true --helm-set=ingressController.enabled=true --helm-set=debug.verbose=datapath --helm-set-string=tunnelProtocol=vxlan --helm-set=image.repository=localhost:5000/cilium/cilium-dev --helm-set=image.useDigest=false --helm-set=image.tag=local --helm-set=operator.image.repository=localhost:5000/cilium/operator --helm-set=operator.image.tag=local --helm-set operator.image.suffix= --helm-set=operator.image.useDigest=false --helm-set=image.pullPolicy=IfNotPresent --helm-set=operator.image.pullPolicy=IfNotPresent
- Deploy some pods
cilium-cli connectivity test --include-unsafe-tests --flush-ct --test "skipall"
- curl from a pod to 1.1.1.1
nspod client2-6b89df6c77-bqd6b curl 1.1.1.1
- Use pwru to check path
SKB CPU PROCESS NETNS MARK/x IFACE PROTO MTU LEN TUPLE FUNC
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 0 0x0000 1450 60 10.244.2.93:19233->1.1.1.1:80(tcp) ip_local_out
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 0 0x0000 1450 60 10.244.2.93:19233->1.1.1.1:80(tcp) __ip_local_out
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 0 0x0800 1450 60 10.244.2.93:19233->1.1.1.1:80(tcp) ip_output
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1450 60 10.244.2.93:19233->1.1.1.1:80(tcp) nf_hook_slow
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1450 60 10.244.2.93:19233->1.1.1.1:80(tcp) apparmor_ip_postroute
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1450 60 10.244.2.93:19233->1.1.1.1:80(tcp) ip_finish_output
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1450 60 10.244.2.93:19233->1.1.1.1:80(tcp) __ip_finish_output
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1450 60 10.244.2.93:19233->1.1.1.1:80(tcp) ip_finish_output2
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1450 60 10.244.2.93:19233->1.1.1.1:80(tcp) neigh_resolve_output
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1450 60 10.244.2.93:19233->1.1.1.1:80(tcp) __neigh_event_send
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1450 60 10.244.2.93:19233->1.1.1.1:80(tcp) eth_header
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1450 60 10.244.2.93:19233->1.1.1.1:80(tcp) skb_push
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1450 74 10.244.2.93:19233->1.1.1.1:80(tcp) __dev_queue_xmit
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1450 74 10.244.2.93:19233->1.1.1.1:80(tcp) qdisc_pkt_len_init
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1500 74 10.244.2.93:19233->1.1.1.1:80(tcp) netdev_core_pick_tx
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1500 74 10.244.2.93:19233->1.1.1.1:80(tcp) validate_xmit_skb
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1500 74 10.244.2.93:19233->1.1.1.1:80(tcp) netif_skb_features
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1500 74 10.244.2.93:19233->1.1.1.1:80(tcp) passthru_features_check
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1500 74 10.244.2.93:19233->1.1.1.1:80(tcp) skb_network_protocol
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1500 74 10.244.2.93:19233->1.1.1.1:80(tcp) skb_csum_hwoffload_help
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1500 74 10.244.2.93:19233->1.1.1.1:80(tcp) validate_xmit_xfrm
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1500 74 10.244.2.93:19233->1.1.1.1:80(tcp) dev_hard_start_xmit
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1500 74 10.244.2.93:19233->1.1.1.1:80(tcp) skb_clone_tx_timestamp
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1500 74 10.244.2.93:19233->1.1.1.1:80(tcp) __dev_forward_skb
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1500 74 10.244.2.93:19233->1.1.1.1:80(tcp) __dev_forward_skb2
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1500 74 10.244.2.93:19233->1.1.1.1:80(tcp) skb_scrub_packet
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026534552 0 eth0:9 0x0800 1500 74 10.244.2.93:19233->1.1.1.1:80(tcp) eth_type_trans
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026533134 0 ~ba50a31925f8:10 0x0800 1500 60 10.244.2.93:19233->1.1.1.1:80(tcp) __netif_rx
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026533134 0 ~ba50a31925f8:10 0x0800 1500 60 10.244.2.93:19233->1.1.1.1:80(tcp) netif_rx_internal
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026533134 0 ~ba50a31925f8:10 0x0800 1500 60 10.244.2.93:19233->1.1.1.1:80(tcp) enqueue_to_backlog
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026533134 0 ~ba50a31925f8:10 0x0800 1500 60 10.244.2.93:19233->1.1.1.1:80(tcp) __netif_receive_skb
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026533134 0 ~ba50a31925f8:10 0x0800 1500 60 10.244.2.93:19233->1.1.1.1:80(tcp) __netif_receive_skb_one_core
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026533134 0 ~ba50a31925f8:10 0x0800 1500 60 10.244.2.93:19233->1.1.1.1:80(tcp) tcf_classify
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026533134 0 ~ba50a31925f8:10 0x0800 1500 74 10.244.2.93:19233->1.1.1.1:80(tcp) skb_ensure_writable
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026533134 0 ~ba50a31925f8:10 0x0800 1500 74 10.244.2.93:19233->1.1.1.1:80(tcp) skb_ensure_writable
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026533134 0 ~ba50a31925f8:10 0x0800 1500 74 10.244.2.93:19233->1.1.1.1:80(tcp) skb_ensure_writable
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026533134 0 ~ba50a31925f8:10 0x0800 1500 74 10.244.2.93:19233->1.1.1.1:80(tcp) skb_ensure_writable
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026533134 9dcf0f00 ~ba50a31925f8:10 0x0800 1500 60 10.244.2.93:19233->1.1.1.1:80(tcp) ip_rcv
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026533134 9dcf0f00 ~ba50a31925f8:10 0x0800 1500 60 10.244.2.93:19233->1.1.1.1:80(tcp) ip_rcv_core
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026533134 9dcf0f00 ~ba50a31925f8:10 0x0800 1500 60 10.244.2.93:19233->1.1.1.1:80(tcp) tcp_wfree
0xffff8f2b590d8ee8 0 ~in/curl:1483457 4026533134 9dcf0f00 ~ba50a31925f8:10 0x0800 1500 60 10.244.2.93:19233->1.1.1.1:80(tcp) nf_hook_slow
The final functions in the trace (ip_rcv, ip_rcv_core, nf_hook_slow) indicate that the packet goes up to the stack instead of being redirected with bpf_redirect.
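The check described above can be automated with a small script. This is only a rough sketch: it assumes pwru's default column layout as shown in the trace (SKB address first, FUNC last), and the helper names `funcs_for_tuple` and `punted_to_stack` are hypothetical. It treats `ip_rcv` / `ip_rcv_core` as the sign of a stack traversal, which is a heuristic taken from this trace rather than a general rule.

```python
# Kernel functions that, for an egress pod-to-world flow, suggest the packet
# was handed to the host IP stack (heuristic taken from the trace above).
STACK_FUNCS = {"ip_rcv", "ip_rcv_core"}


def funcs_for_tuple(trace, tuple_prefix):
    """Return the FUNC column, in order, for rows matching a flow tuple."""
    funcs = []
    for line in trace.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not an skb record.
        if len(fields) < 2 or not fields[0].startswith("0x"):
            continue
        if any(f.startswith(tuple_prefix) for f in fields):
            funcs.append(fields[-1])
    return funcs


def punted_to_stack(trace, tuple_prefix):
    """True if the flow hit one of STACK_FUNCS anywhere in the trace."""
    return any(f in STACK_FUNCS for f in funcs_for_tuple(trace, tuple_prefix))
```

For the trace in this report, `punted_to_stack(trace_text, "10.244.2.93:19233->1.1.1.1:80")` would be expected to return `True`, since `ip_rcv` appears after `tcf_classify` on the host-side veth.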
Cilium Version
cilium $ cilium-cli version
cilium-cli: v0.16.13-63-g313d6a57 compiled with go1.22.0 on linux/amd64
cilium image (default): v1.16.0
cilium image (stable): v1.16.1
cilium image (running): 1.17.0-dev
Kernel Version
cilium $ uname -a
Linux liangzc-l-PF4RDLEQ 6.5.0-1027-oem #28-Ubuntu SMP PREEMPT_DYNAMIC Thu Jul 25 13:32:46 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Kubernetes Version
cilium $ k version
Client Version: v1.30.0
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.2
Regression
I don't think it's a regression, as v1.14.14 has the same issue.
Sysdump
No response
Relevant log output
No response
Anything else?
This is because encap_and_redirect_lxc() returns DROP_NO_TUNNEL_ENDPOINT for to-world packets, which ends up at goto pass_to_stack:
https://github.com/cilium/cilium/blob/v1.17.0-pre.0/bpf/bpf_lxc.c#L1267
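The control flow described above can be modeled roughly as follows. This is a simplified Python model, not the BPF C code in bpf_lxc.c. Only the names `encap_and_redirect_lxc`, `DROP_NO_TUNNEL_ENDPOINT` and `pass_to_stack` come from this report; the `tunnel_endpoints` set, the `egress_verdict` function, and the returned strings are hypothetical stand-ins, and the real datapath checks more conditions than shown here.

```python
DROP_NO_TUNNEL_ENDPOINT = "DROP_NO_TUNNEL_ENDPOINT"
ENCAP_OK = "ENCAP_OK"  # hypothetical stand-in for a successful encap + redirect


def encap_and_redirect_lxc(tunnel_endpoints, dst_ip):
    """Model: encapsulation only succeeds if dst_ip maps to a tunnel endpoint.

    World destinations such as 1.1.1.1 have no tunnel endpoint, so the
    lookup fails for them.
    """
    if dst_ip in tunnel_endpoints:
        return ENCAP_OK
    return DROP_NO_TUNNEL_ENDPOINT


def egress_verdict(tunnel_enabled, tunnel_endpoints, dst_ip):
    """Model of the egress decision for a packet leaving a pod."""
    if tunnel_enabled:
        ret = encap_and_redirect_lxc(tunnel_endpoints, dst_ip)
        if ret == DROP_NO_TUNNEL_ENDPOINT:
            # Reported behavior: the packet falls through to the host stack
            # even though BPF host routing is enabled.
            return "pass_to_stack"
        return "redirect_to_tunnel"
    # Expected behavior with BPF host routing: redirect from lxc to eth0.
    return "bpf_redirect_to_native_device"
```

In this model, `egress_verdict(True, {"10.244.1.7"}, "1.1.1.1")` yields `"pass_to_stack"`, which mirrors the pwru trace, while the same packet with `tunnel_enabled=False` yields the direct redirect.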
Cilium Users Document
- Are you a user of Cilium? Please add yourself to the Users doc
Code of Conduct
- I agree to follow this project's Code of Conduct