Labels
affects/v1.14, affects/v1.15, area/encryption, area/kpr, area/loadbalancing, area/proxy, feature/wireguard, kind/bug, sig/policy
Description
Is there an existing issue for this?
- I have searched the existing issues
What happened?
Steps to reproduce
1. Create a kind Cilium cluster
make kind-down
export IMAGE=kindest/node:v1.29.4@sha256:3abb816a5b1061fb15c6e9e60856ec40d56b7b52bcea5f5f1350bc6e2320b6f8
./contrib/scripts/kind.sh --xdp --secondary-network "" 3 "" "" iptables dual 0.0.0.0 6443
kubectl patch node kind-worker3 --type=json -p='[{"op":"add","path":"/metadata/labels/cilium.io~1no-schedule","value":"true"}]'
git checkout 8cd748de7f4011d4ab9fc04338fa69c194205e6a # https://github.com/cilium/cilium/commit/8cd748de7f4011d4ab9fc04338fa69c194205e6a
make kind-image
kind load --name kind docker-image localhost:5000/cilium/cilium-dev:local
kind load --name kind docker-image localhost:5000/cilium/operator-generic:local
./cilium-cli install --wait --chart-directory=./install/kubernetes/cilium --helm-set=debug.enabled=true --helm-set=debug.verbose=envoy --helm-set=hubble.eventBufferCapacity=65535 --helm-set=bpf.monitorAggregation=none --helm-set=cluster.name=default --helm-set=authentication.mutual.spire.enabled=false --nodes-without-cilium --helm-set-string=kubeProxyReplacement=true --set='' --helm-set-string=routingMode=native --helm-set-string=autoDirectNodeRoutes=true --helm-set-string=ipv4NativeRoutingCIDR=10.244.0.0/16 --helm-set-string=ipv6NativeRoutingCIDR=fd00:10:244::/56 --helm-set=devices='{eth0,eth1}' --helm-set-string=loadBalancer.mode=snat --helm-set=ipv6.enabled=true --helm-set=bpf.masquerade=true --helm-set=egressGateway.enabled=true --helm-set=encryption.enabled=true --helm-set=encryption.type=wireguard --helm-set=encryption.nodeEncryption=true --helm-set=encryption.ipsec.encryptedOverlay=false --helm-set=ingressController.enabled=true --helm-set=ingressController.service.type=NodePort --helm-set=image.repository=localhost:5000/cilium/cilium-dev --helm-set=image.useDigest=false --helm-set=image.tag=local --helm-set=operator.image.repository=localhost:5000/cilium/operator --helm-set=operator.image.tag=local --helm-set operator.image.suffix= --helm-set=operator.image.useDigest=false --helm-set=image.pullPolicy=IfNotPresent --helm-set=operator.image.pullPolicy=IfNotPresent --helm-set=debug.verbose=datapath
./cilium-cli status --wait
./cilium-cli connectivity test --include-unsafe-tests --flush-ct --test "skipall" -v -p
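As a sanity check (assuming the cilium status output format hasn't changed), WireGuard node-to-node encryption can be verified on one of the agents before moving on:
# the agent should report WireGuard encryption with NodeEncryption enabled
kubectl -nkube-system exec ds/cilium -- cilium status | grep -i encryption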
2. Check pod to remote NodePort connectivity
# kubectl -ncilium-test get po -owide | grep client2
client2-ccd7b8bdf-dtdng 1/1 Running 0 32m 10.244.3.159 kind-worker2 <none> <none>
# kubectl -nkube-system get po -owide | grep cilium | grep kind-worker
cilium-2lwb2 1/1 Running 0 34m 172.19.0.5 kind-worker2 <none> <none>
cilium-9bn77 1/1 Running 0 34m 172.19.0.4 kind-worker <none> <none>
cilium-envoy-6s69m 1/1 Running 0 34m 172.19.0.5 kind-worker2 <none> <none>
cilium-envoy-wjhfh 1/1 Running 0 34m 172.19.0.4 kind-worker <none> <none>
cilium-operator-7b87f5b697-w88vj 1/1 Running 0 34m 172.19.0.4 kind-worker
# client2 pod is on kind-worker2, so we choose a remote nodeport on kind-worker
# kubectl -nkube-system exec cilium-9bn77 -- cilium service list | grep -i hostport
17 172.19.0.4:4000 HostPort 1 => 10.244.2.175:8080 (active)
18 [fc00:c111::4]:4000 HostPort 1 => [fd00:10:244:2::fc99]:8080 (active)
19 0.0.0.0:4000 HostPort 1 => 10.244.2.175:8080 (active)
20 [::]:4000 HostPort 1 => [fd00:10:244:2::fc99]:8080 (active)
# kubectl -ncilium-test exec client2-ccd7b8bdf-dtdng -- curl 172.19.0.4:4000 -I
HTTP/1.1 200 OK
X-Powered-By: Express
Vary: Origin, Accept-Encoding
Access-Control-Allow-Credentials: true
Accept-Ranges: bytes
Cache-Control: public, max-age=0
Last-Modified: Tue, 09 Jan 2024 12:57:12 GMT
ETag: W/"809-18cee4c6040"
Content-Type: text/html; charset=UTF-8
Content-Length: 2057
Date: Wed, 05 Jun 2024 03:38:14 GMT
Connection: keep-alive
Keep-Alive: timeout=5
3. Apply an L7 ingress policy and check connectivity again
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-8080-ingress
spec:
  endpointSelector:
    matchLabels:
      kind: echo
  ingress:
  - fromEntities:
    - all
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
      rules:
        http:
        - method: "GET"
          path: "/$"
kubectl -ncilium-test apply -f allow-8080-ingress.yaml
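To double-check that the policy is realized and the echo endpoints now have ingress enforcement (and therefore an L7 proxy redirect), something like the following can be used; the exact columns of cilium endpoint list vary between versions:
kubectl -ncilium-test get cnp allow-8080-ingress
# ingress policy enforcement should show up as Enabled for the echo endpoints
kubectl -nkube-system exec cilium-9bn77 -- cilium endpoint list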
After applying the policy, the pod can no longer connect to the remote NodePort:
$ kubectl -ncilium-test exec client2-ccd7b8bdf-dtdng -- curl 172.19.0.4:4000 -I -v
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 172.19.0.4:4000...
0 0 0 0 0 0 0 0 --:--:-- 0:00:09 --:--:-- 0^C
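To see how each node handles the flows (a suggestion, assuming Hubble is enabled as in the install command above), they can be watched from both sides:
# flows involving the client pod, as seen from the client's node (kind-worker2)
kubectl -nkube-system exec cilium-2lwb2 -- hubble observe --pod cilium-test/client2-ccd7b8bdf-dtdng --follow
# flows hitting the HostPort, as seen from the backend's node (kind-worker)
kubectl -nkube-system exec cilium-9bn77 -- hubble observe --port 4000 --follow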
Cilium Version
Client: 1.16.0-dev 8cd748de7f 2024-05-20T11:00:02+09:00 go version go1.22.3 linux/amd64
Daemon: 1.16.0-dev 8cd748de7f 2024-05-20T11:00:02+09:00 go version go1.22.3 linux/amd64
Kernel Version
Linux liangzc-l-PF4RDLEQ 6.5.0-1023-oem #24-Ubuntu SMP PREEMPT_DYNAMIC Tue May 7 14:26:31 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Kubernetes Version
Client Version: v1.30.0
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.4
Regression
No response
Sysdump
No response
Relevant log output
No response
Anything else?
This bug is similar to #32897 and is caused by a missing revDNAT translation for the proxy's reply.
TL;DR: the proxy's reply (the TCP SYN-ACK it sends on behalf of the destination pod) is WireGuard-encrypted before revDNAT runs at to-netdev@eth0, so the revDNAT code cannot process the already-encrypted packet. The reply is eventually routed back to the source pod with the wrong source IP and gets dropped with SKB_DROP_REASON_NO_SOCKET.
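One way to observe the missing revDNAT directly (a sketch, assuming tcpdump is available inside the kind node containers and the pod/interface addresses from the output above still apply):
# On kind-worker2 (the client's node), capture the client pod's traffic on the WireGuard
# interface. Without revDNAT, the SYN-ACK arrives with the backend's address
# 10.244.2.175:8080 as its source instead of 172.19.0.4:4000, so the client's stack
# finds no matching socket and the packet is dropped (SKB_DROP_REASON_NO_SOCKET).
docker exec kind-worker2 tcpdump -ni cilium_wg0 host 10.244.3.159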
Cilium Users Document
- Are you a user of Cilium? Please add yourself to the Users doc
Code of Conduct
- I agree to follow this project's Code of Conduct