NodePort unreachable in kind+cilium+kubeproxyless #25479

@acuteaura

Description

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

When using kind with Cilium and kubeProxyReplacement=strict, NodePort services do not work.

Kind config:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: proxyless
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
  extraPortMappings:
  - containerPort: 30080
    hostPort: 80
    protocol: TCP
  - containerPort: 30443
    hostPort: 443
    protocol: TCP
networking:
  disableDefaultCNI: true
  kubeProxyMode: none
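
For reproduction, the cluster can be created from this config with something like the following (the file name is illustrative):

❯ kind create cluster --config kind-config.yaml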

Cilium Helm values:

kubeProxyReplacement: strict
k8sServiceHost: proxyless-control-plane
k8sServicePort: 6443
hostServices:
  enabled: false
externalIPs:
  enabled: true
nodePort:
  enabled: true
hostPort:
  enabled: true
image:
  pullPolicy: IfNotPresent
ipam:
  mode: kubernetes
hubble:
  enabled: true
  relay:
    enabled: true
  ui:
    enabled: true
gatewayAPI:
  enabled: true
ingressController:
  service:
    type: NodePort

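These values would be applied with a plain Helm install; a minimal sketch, assuming the official Cilium chart repo is already added and the release is named cilium (values file name illustrative):

❯ helm install cilium cilium/cilium --version 1.13.2 --namespace kube-system -f cilium-values.yaml
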
Values for Contour (the service I can't reach):

envoy:
  service:
    type: NodePort
    externalTrafficPolicy: Cluster
    nodePorts:
      http: 30080
      https: 30443
defaultBackend:
  enabled: true

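The chart label further below (contour-12.0.0) suggests the Bitnami Contour chart; a sketch of how these values would be applied (repo alias and file name illustrative):

❯ helm install contour bitnami/contour --namespace projectcontour --create-namespace -f contour-values.yaml
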
The service I can't reach:

apiVersion: v1
kind: Service
metadata:
  annotations:
    meta.helm.sh/release-name: contour
    meta.helm.sh/release-namespace: projectcontour
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
  creationTimestamp: "2023-05-16T03:59:11Z"
  labels:
    app.kubernetes.io/component: envoy
    app.kubernetes.io/instance: contour
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: contour
    helm.sh/chart: contour-12.0.0
  name: contour-envoy
  namespace: projectcontour
  resourceVersion: "3345"
  uid: a22f99e1-31d7-4717-89f3-7df3a99df71d
spec:
  clusterIP: 10.96.90.240
  clusterIPs:
  - 10.96.90.240
  externalTrafficPolicy: Cluster
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: http
    nodePort: 30080
    port: 80
    protocol: TCP
    targetPort: http
  - name: https
    nodePort: 30443
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    app.kubernetes.io/component: envoy
    app.kubernetes.io/instance: contour
    app.kubernetes.io/name: contour
  sessionAffinity: None
  type: NodePort
status:
  loadBalancer: {}

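As a sanity check, the service's endpoints can be listed; the same backends also show up as active in the cilium service list further below:

❯ kubectl -n projectcontour get endpoints contour-envoy
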
The service can't be reached via the kind port mappings (hostPort 80 → containerPort 30080 on proxyless-worker):

❯ curl localhost
(timeout)

The port on the node itself also can't be reached:

❯ docker inspect proxyless-worker | jq .[0].NetworkSettings.Networks.kind.IPAddress
"172.18.0.5"

❯ curl 172.18.0.5:30080
(timeout)

Notably, other ports do not time out; connections to them are refused outright.

❯ curl 172.18.0.5:30081
curl: (7) Failed to connect to 172.18.0.5 port 30081 after 0 ms: Connection refused
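
To tell whether packets to 30080 reach Cilium's datapath at all and are dropped there, drop events can be watched on that node's agent while retrying the curl; a sketch (the exact agent pod on proxyless-worker has to be substituted):

❯ kubectl -n kube-system exec <cilium-pod-on-proxyless-worker> -- cilium monitor --type drop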

Even more oddly, the service is reachable on port 80 on all nodes (not just the one with the kind extraPortMappings), which matches the HostPort entries in the cilium service list below:

❯ docker inspect proxyless-worker | jq .[0].NetworkSettings.Networks.kind.IPAddress
"172.18.0.5"

❯ curl 172.18.0.5 -I
HTTP/1.1 404 Not Found
vary: Accept-Encoding
date: Tue, 16 May 2023 04:21:47 GMT
server: envoy
transfer-encoding: chunked

❯ docker inspect proxyless-worker2 | jq .[0].NetworkSettings.Networks.kind.IPAddress
"172.18.0.2"

❯ curl 172.18.0.2 -I
HTTP/1.1 404 Not Found
vary: Accept-Encoding
date: Tue, 16 May 2023 04:21:47 GMT
server: envoy
transfer-encoding: chunked

Cilium Version

1.13.2

Kernel Version

Linux P14s 5.15.90.1-microsoft-standard-WSL2+ #2 SMP Fri May 12 12:26:01 CEST 2023 x86_64 x86_64 x86_64 GNU/Linux

(compiled according to https://wsl.dev/wslcilium/)

Kubernetes Version

Server Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.3", GitCommit:"9e644106593f3f4aa98f8a84b23db5fa378900bd", GitTreeState:"clean", BuildDate:"2023-03-30T06:34:50Z", GoVersion:"go1.19.7", Compiler:"gc", Platform:"linux/amd64"}

Sysdump

cilium-sysdump-20230516-061145.zip

Relevant log output

cilium service list

❯ kubectl -n kube-system exec ds/cilium -- cilium service list
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
ID   Frontend            Service Type   Backend
1    10.96.95.218:8080   ClusterIP      1 => 10.244.2.46:8080 (active)
2    10.96.95.218:8801   ClusterIP      1 => 10.244.2.46:8801 (active)
3    10.96.201.130:80    ClusterIP      1 => 10.244.2.92:8080 (active)
4    10.96.0.1:443       ClusterIP      1 => 172.18.0.4:6443 (active)
5    10.96.0.10:53       ClusterIP      1 => 10.244.2.4:53 (active)
                                        2 => 10.244.2.48:53 (active)
6    10.96.0.10:9153     ClusterIP      1 => 10.244.2.4:9153 (active)
                                        2 => 10.244.2.48:9153 (active)
7    10.96.68.189:80     ClusterIP      1 => 10.244.2.118:8081 (active)
8    10.96.174.54:443    ClusterIP      1 => 172.18.0.2:4244 (active)
9    10.96.53.115:80     ClusterIP      1 => 10.244.2.147:4245 (active)
10   10.96.199.2:8080    ClusterIP      1 => 10.244.2.151:8080 (active)
11   10.96.199.2:8801    ClusterIP      1 => 10.244.2.151:8801 (active)
28   10.96.39.215:8001   ClusterIP      1 => 10.244.3.124:8001 (active)
29   10.96.67.144:80     ClusterIP
30   10.96.90.240:80     ClusterIP      1 => 10.244.1.183:8080 (active)
                                        2 => 10.244.3.198:8080 (active)
                                        3 => 10.244.2.247:8080 (active)
31   10.96.90.240:443    ClusterIP      1 => 10.244.1.183:8443 (active)
                                        2 => 10.244.3.198:8443 (active)
                                        3 => 10.244.2.247:8443 (active)
32   172.18.0.2:30080    NodePort       1 => 10.244.1.183:8080 (active)
                                        2 => 10.244.3.198:8080 (active)
                                        3 => 10.244.2.247:8080 (active)
33   0.0.0.0:30080       NodePort       1 => 10.244.1.183:8080 (active)
                                        2 => 10.244.3.198:8080 (active)
                                        3 => 10.244.2.247:8080 (active)
34   172.18.0.2:30443    NodePort       1 => 10.244.1.183:8443 (active)
                                        2 => 10.244.3.198:8443 (active)
                                        3 => 10.244.2.247:8443 (active)
35   0.0.0.0:30443       NodePort       1 => 10.244.1.183:8443 (active)
                                        2 => 10.244.3.198:8443 (active)
                                        3 => 10.244.2.247:8443 (active)
36   172.18.0.2:80       HostPort       1 => 10.244.1.183:8080 (active)
37   0.0.0.0:80          HostPort       1 => 10.244.1.183:8080 (active)
38   172.18.0.2:443      HostPort       1 => 10.244.1.183:8443 (active)
39   0.0.0.0:443         HostPort       1 => 10.244.1.183:8443 (active)
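
The agent's service view can be cross-checked against the BPF load-balancer map that the datapath actually consults; a sketch using the same exec pattern:

❯ kubectl -n kube-system exec ds/cilium -- cilium bpf lb list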

Anything else?

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

Labels

area/datapath, kind/bug, kind/community-report, needs/triage, stale