Open
Labels
- affects/v1.14: This issue affects v1.14 branch
- affects/v1.15: This issue affects v1.15 branch
- affects/v1.16: This issue affects v1.16 branch
- area/clustermesh: Relates to multi-cluster routing functionality in Cilium.
- area/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
- area/eni: Impacts ENI based IPAM.
- kind/bug: This is a bug in the Cilium logic.
- pinned: These issues are not marked stale by our issue bot.
Description
Is there an existing issue for this?
- I have searched the existing issues
What happened?
While investigating #20797, we found that the 7th bit of skb->mark (mask 0x80), which Cilium uses to carry the security identity, is also used by AWS VPC CNI for policy-based routing (PBR). As a result, iptables rules can overwrite the identity that Cilium stored in the mark, and the network policy then drops the packet.
There are two possible iptables rules that overwrite the identity:
- The rule in the CILIUM_PRE_mangle chain of the mangle table, installed by Cilium in ENI mode (introduced in c7f9997)
  -A CILIUM_PRE_mangle -i lxc+ -m comment --comment "cilium: primary ENI" -j CONNMARK --restore-mark --nfmask 0x80 --ctmask 0x80
- The rule in the PREROUTING chain of the nat table, installed by AWS VPC CNI
  -A PREROUTING -m comment --comment "AWS, CONNMARK" -j CONNMARK --restore-mark --nfmask 0x80 --ctmask 0x80
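To make the effect of these rules concrete, here is a small sketch of what `CONNMARK --restore-mark` does to the packet mark. It uses the formula given in the iptables-extensions documentation, `nfmark = (nfmark & ~nfmask) ^ (ctmark & ctmask)`. The function name `restore_mark` and the sample mark values are hypothetical and chosen only for illustration.

```python
def restore_mark(nfmark: int, ctmark: int, nfmask: int = 0x80, ctmask: int = 0x80) -> int:
    """Model CONNMARK --restore-mark: copy masked connection-mark bits into the packet mark."""
    return ((nfmark & ~nfmask) ^ (ctmark & ctmask)) & 0xFFFFFFFF


# Hypothetical packet mark whose low byte is 0x80 (for example ClusterID = 128).
packet_mark = 0x04D20F80

# The connection mark has bit 0x80 clear, so the rule clears it in the packet mark too.
after_clear = restore_mark(packet_mark, ctmark=0x00)

# The connection mark has bit 0x80 set, so the rule sets it in a packet mark that had it clear.
after_set = restore_mark(0x04D20F01, ctmark=0x80)

print(hex(after_clear), hex(after_set))
```

In both directions only bit 0x80 changes, but since that bit is part of the value Cilium reads back as the identity, the packet is then evaluated against the wrong identity.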
Three setup scenarios are affected by this issue:
- Cilium is running in ENI mode after AWS VPC CNI was uninstalled (e.g. create an EKS cluster and install Cilium afterwards)
- Cilium is running with AWS VPC CNI in chaining mode
- Cilium has been running in ENI mode without AWS VPC CNI from the beginning (e.g. a self-hosted k8s cluster on EC2 hosts)
Reported by: @carloscastrojumo @EricMountain @hemanthmalla
Cilium Version
All Cilium versions after 1.9-rc1 are affected
Kernel Version
Kernel version doesn't matter
Kubernetes Version
Kubernetes version doesn't matter
Sysdump
No response
Relevant log output
Investigation logs
- Traffic dropped for identity not found #20797 (comment)
- Traffic dropped for identity not found #20797 (comment)
Connectivity test logs with ClusterID = 128
📋 Test Report
❌ 12/23 tests failed (26/114 actions), 0 tests skipped, 1 scenarios skipped:
Test [allow-all-except-world]:
❌ allow-all-except-world/pod-to-pod/curl-0: cilium-test/client-7bdbddd7b-7rrzq (192.168.22.31) -> cilium-test/echo-same-node-7894f8ffcd-vws46 (192.168.110.77:8080)
❌ allow-all-except-world/pod-to-pod/curl-1: cilium-test/client2-74f4559c78-xzg5n (192.168.101.165) -> cilium-test/echo-same-node-7894f8ffcd-vws46 (192.168.110.77:8080)
❌ allow-all-except-world/client-to-client/ping-0: cilium-test/client-7bdbddd7b-7rrzq (192.168.22.31) -> cilium-test/client2-74f4559c78-xzg5n (192.168.101.165:0)
❌ allow-all-except-world/client-to-client/ping-1: cilium-test/client2-74f4559c78-xzg5n (192.168.101.165) -> cilium-test/client-7bdbddd7b-7rrzq (192.168.22.31:0)
❌ allow-all-except-world/pod-to-service/curl-0: cilium-test/client2-74f4559c78-xzg5n (192.168.101.165) -> cilium-test/echo-same-node (echo-same-node:8080)
❌ allow-all-except-world/pod-to-service/curl-1: cilium-test/client-7bdbddd7b-7rrzq (192.168.22.31) -> cilium-test/echo-same-node (echo-same-node:8080)
Test [client-ingress]:
❌ client-ingress/client-to-client/ping-1: cilium-test/client2-74f4559c78-xzg5n (192.168.101.165) -> cilium-test/client-7bdbddd7b-7rrzq (192.168.22.31:0)
Test [echo-ingress]:
❌ echo-ingress/pod-to-pod/curl-0: cilium-test/client2-74f4559c78-xzg5n (192.168.101.165) -> cilium-test/echo-same-node-7894f8ffcd-vws46 (192.168.110.77:8080)
Test [client-ingress-icmp]:
❌ client-ingress-icmp/client-to-client/ping-1: cilium-test/client2-74f4559c78-xzg5n (192.168.101.165) -> cilium-test/client-7bdbddd7b-7rrzq (192.168.22.31:0)
Test [echo-ingress-l7]:
❌ echo-ingress-l7/pod-to-pod-with-endpoints/curl-0-public: cilium-test/client2-74f4559c78-xzg5n (192.168.101.165) -> curl-0-public (192.168.110.77:8080)
❌ echo-ingress-l7/pod-to-pod-with-endpoints/curl-0-private: cilium-test/client2-74f4559c78-xzg5n (192.168.101.165) -> curl-0-private (192.168.110.77:8080)
❌ echo-ingress-l7/pod-to-pod-with-endpoints/curl-0-privatewith-header: cilium-test/client2-74f4559c78-xzg5n (192.168.101.165) -> curl-0-privatewith-header (192.168.110.77:8080)
Test [echo-ingress-l7-named-port]:
❌ echo-ingress-l7-named-port/pod-to-pod-with-endpoints/curl-1-public: cilium-test/client2-74f4559c78-xzg5n (192.168.101.165) -> curl-1-public (192.168.110.77:8080)
❌ echo-ingress-l7-named-port/pod-to-pod-with-endpoints/curl-1-private: cilium-test/client2-74f4559c78-xzg5n (192.168.101.165) -> curl-1-private (192.168.110.77:8080)
❌ echo-ingress-l7-named-port/pod-to-pod-with-endpoints/curl-1-privatewith-header: cilium-test/client2-74f4559c78-xzg5n (192.168.101.165) -> curl-1-privatewith-header (192.168.110.77:8080)
Test [echo-ingress-from-other-client-deny]:
❌ echo-ingress-from-other-client-deny/pod-to-pod/curl-0: cilium-test/client-7bdbddd7b-7rrzq (192.168.22.31) -> cilium-test/echo-same-node-7894f8ffcd-vws46 (192.168.110.77:8080)
❌ echo-ingress-from-other-client-deny/client-to-client/ping-0: cilium-test/client-7bdbddd7b-7rrzq (192.168.22.31) -> cilium-test/client2-74f4559c78-xzg5n (192.168.101.165:0)
❌ echo-ingress-from-other-client-deny/client-to-client/ping-1: cilium-test/client2-74f4559c78-xzg5n (192.168.101.165) -> cilium-test/client-7bdbddd7b-7rrzq (192.168.22.31:0)
Test [client-ingress-from-other-client-icmp-deny]:
❌ client-ingress-from-other-client-icmp-deny/pod-to-pod/curl-0: cilium-test/client-7bdbddd7b-7rrzq (192.168.22.31) -> cilium-test/echo-same-node-7894f8ffcd-vws46 (192.168.110.77:8080)
❌ client-ingress-from-other-client-icmp-deny/pod-to-pod/curl-1: cilium-test/client2-74f4559c78-xzg5n (192.168.101.165) -> cilium-test/echo-same-node-7894f8ffcd-vws46 (192.168.110.77:8080)
❌ client-ingress-from-other-client-icmp-deny/client-to-client/ping-0: cilium-test/client-7bdbddd7b-7rrzq (192.168.22.31) -> cilium-test/client2-74f4559c78-xzg5n (192.168.101.165:0)
Test [client-egress-to-echo-deny]:
❌ client-egress-to-echo-deny/client-to-client/ping-0: cilium-test/client-7bdbddd7b-7rrzq (192.168.22.31) -> cilium-test/client2-74f4559c78-xzg5n (192.168.101.165:0)
❌ client-egress-to-echo-deny/client-to-client/ping-1: cilium-test/client2-74f4559c78-xzg5n (192.168.101.165) -> cilium-test/client-7bdbddd7b-7rrzq (192.168.22.31:0)
Test [client-ingress-to-echo-named-port-deny]:
❌ client-ingress-to-echo-named-port-deny/pod-to-pod/curl-0: cilium-test/client2-74f4559c78-xzg5n (192.168.101.165) -> cilium-test/echo-same-node-7894f8ffcd-vws46 (192.168.110.77:8080)
Test [client-egress-to-echo-expression-deny]:
❌ client-egress-to-echo-expression-deny/pod-to-pod/curl-0: cilium-test/client2-74f4559c78-xzg5n (192.168.101.165) -> cilium-test/echo-same-node-7894f8ffcd-vws46 (192.168.110.77:8080)
Test [client-egress-to-echo-service-account-deny]:
❌ client-egress-to-echo-service-account-deny/pod-to-pod/curl-0: cilium-test/client2-74f4559c78-xzg5n (192.168.101.165) -> cilium-test/echo-same-node-7894f8ffcd-vws46 (192.168.110.77:8080)
connectivity test failed: 12 tests failed
Anything else?
No response
Code of Conduct
- I agree to follow this project's Code of Conduct