l7lb: Same pod routing issue #39531

@sayboras

Description

Is there an existing issue for this?

  • I have searched the existing issues

Version

equal or higher than v1.17.3 and lower than v1.18.0

What happened?

L7 LB via Envoy is unable to route traffic when the client and the server are the same pod, i.e. a pod sends a request to a service that load-balances back to itself.

How can we reproduce the issue?

Install Cilium with the configuration below:

loadBalancer:
  l7:
    backend: envoy
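
For reference, a minimal sketch of installing Cilium with these values via Helm; the release name, namespace, and chart version below are illustrative and should be adjusted to the environment:

helm repo add cilium https://helm.cilium.io
helm repo update
helm upgrade --install cilium cilium/cilium \
  --namespace kube-system \
  --version 1.17.3 \
  --set loadBalancer.l7.backend=envoy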

Deploy the manifests below.

Workload
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo-v1
  labels:
    app: echo
spec:
  selector:
    matchLabels:
      app: echo
      version: v1
  template:
    metadata:
      labels:
        app: echo
        version: v1
    spec:
      containers:
      - name: echo
        image: gcr.io/k8s-staging-gateway-api/echo-advanced:v20240412-v1.0.0-394-g40c666fd
        imagePullPolicy: IfNotPresent
        args:
        - --tcp=9090
        - --port=8080
        - --grpc=7070
        - --port=8443
        - --tls=8443
        - --crt=/cert.crt
        - --key=/cert.key
---
apiVersion: v1
kind: Service
metadata:
  name: echo-v1
spec:
  selector:
    app: echo
    version: v1
  ports:
  - name: http
    port: 80
    appProtocol: http
    targetPort: 8080
  - name: http-alt
    port: 8080
    appProtocol: http
  - name: https
    port: 443
    targetPort: 8443
  - name: tcp
    port: 9090
  - name: grpc
    port: 7070
    appProtocol: grpc
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo-v2
  labels:
    app: echo
spec:
  selector:
    matchLabels:
      app: echo
      version: v2
  template:
    metadata:
      labels:
        app: echo
        version: v2
    spec:
      containers:
      - name: echo
        image: gcr.io/k8s-staging-gateway-api/echo-advanced:v20240412-v1.0.0-394-g40c666fd
        imagePullPolicy: IfNotPresent
        args:
        - --tcp=9090
        - --port=8080
        - --grpc=7070
        - --port=8443
        - --tls=8443
        - --crt=/cert.crt
        - --key=/cert.key
---
apiVersion: v1
kind: Service
metadata:
  name: echo-v2
spec:
  selector:
    app: echo
    version: v2
  ports:
  - name: http
    port: 80
    appProtocol: http
    targetPort: 8080
  - name: http-alt
    port: 8080
    appProtocol: http
  - name: https
    port: 443
    targetPort: 8443
  - name: tcp
    port: 9090
  - name: grpc
    port: 7070
    appProtocol: grpc
---
apiVersion: v1
kind: Service
metadata:
  name: echo
  annotations:
    "service.cilium.io/lb-l7": "enabled"
spec:
  selector:
    app: echo
    version: v1
  ports:
  - name: http
    port: 80
    appProtocol: http
    targetPort: 8080
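
The manifests above can be applied with kubectl; the file name here is illustrative:

kubectl apply -f echo.yaml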

Send traffic from the echo-v1 pod to the echo service, as sketched below.
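
A minimal sketch of one way to do this, assuming curl is available in the echo container; the pod name is looked up via its labels, and the /v1 path matches the request seen in the monitor output below:

POD=$(kubectl get pod -l app=echo,version=v1 -o jsonpath='{.items[0].metadata.name}')
kubectl exec "$POD" -c echo -- curl -sv http://echo/v1

With the lb-l7 annotation on the echo service, this request is redirected through the Envoy proxy and fails with a 503, while the same request sent from a pod that is not the selected backend is expected to succeed.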

Cilium Version

The issue is observed in v1.16.0 and later.

Kernel Version

N/A

Kubernetes Version

N/A

Regression

N/A

Sysdump

No response

Relevant log output

Cilium monitor output; the SYN packet from 10.244.1.157:37122 to 10.244.1.157:8080 (the proxy's connection back to the same pod) is dropped:


root@kind-worker:/home/cilium# cilium monitor --debug --related-to 167
Listening for events on 56 CPUs with 64x4096 of shared memory
Press Ctrl-C to quit
time=2025-05-08T11:12:21Z level=info source=/home/tammach/go/src/github.com/cilium/cilium/pkg/monitor/dissect.go:57 msg="Initializing dissection cache..."
<- endpoint 167 flow 0x275e899c , identity 61418->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.1.157:44595 -> 10.244.1.218:53 udp
<- endpoint 167 flow 0x275e899c , identity 61418->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.1.157:44595 -> 10.244.1.218:53 udp
-> endpoint 167 flow 0x0 , identity 55362->61418 state reply ifindex lxccb5c6c879ceb orig-ip 10.244.1.218: 10.244.1.218:53 -> 10.244.1.157:44595 udp
-> endpoint 167 flow 0x0 , identity 55362->61418 state reply ifindex lxccb5c6c879ceb orig-ip 10.244.1.218: 10.244.1.218:53 -> 10.244.1.157:44595 udp
<- endpoint 167 flow 0xe8c894f7 , identity 61418->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.1.157:37122 -> 10.96.143.130:80 tcp SYN
-> proxy port 13874 flow 0xe8c894f7 , identity 61418->unknown state new ifindex 0 orig-ip 0.0.0.0: 10.244.1.157:37122 -> 10.96.143.130:80 tcp SYN
-> endpoint 167 flow 0x37659fde , identity world->61418 state reply ifindex lxccb5c6c879ceb orig-ip 10.96.143.130: 10.96.143.130:80 -> 10.244.1.157:37122 tcp SYN, ACK
<- endpoint 167 flow 0xe8c894f7 , identity 61418->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.1.157:37122 -> 10.96.143.130:80 tcp ACK
-> proxy port 13874 flow 0xe8c894f7 , identity 61418->unknown state established ifindex 0 orig-ip 0.0.0.0: 10.244.1.157:37122 -> 10.96.143.130:80 tcp ACK
<- endpoint 167 flow 0xe8c894f7 , identity 61418->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.1.157:37122 -> 10.96.143.130:80 tcp ACK
-> proxy port 13874 flow 0xe8c894f7 , identity 61418->unknown state established ifindex 0 orig-ip 0.0.0.0: 10.244.1.157:37122 -> 10.96.143.130:80 tcp ACK

-> endpoint 167 flow 0x37659fde , identity world->61418 state reply ifindex lxccb5c6c879ceb orig-ip 10.96.143.130: 10.96.143.130:80 -> 10.244.1.157:37122 tcp ACK
<- proxy flow 0xc9509ce0 , identity 61418->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.1.157:37122 -> 10.244.1.157:8080 tcp SYN
-> endpoint 167 flow 0xc9509ce0 , identity 61418->61418 state new ifindex lxccb5c6c879ceb orig-ip 10.244.1.157: 10.244.1.157:37122 -> 10.244.1.157:8080 tcp SYN
<- proxy flow 0xd2976989 , identity 61418->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.1.157:37122 -> 10.244.1.157:8080 tcp SYN
-> endpoint 167 flow 0xd2976989 , identity 61418->61418 state established ifindex lxccb5c6c879ceb orig-ip 10.244.1.157: 10.244.1.157:37122 -> 10.244.1.157:8080 tcp SYN
<- proxy flow 0xb26a780d , identity 61418->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.1.157:37122 -> 10.244.1.157:8080 tcp SYN
-> endpoint 167 flow 0xb26a780d , identity 61418->61418 state established ifindex lxccb5c6c879ceb orig-ip 10.244.1.157: 10.244.1.157:37122 -> 10.244.1.157:8080 tcp SYN
<- proxy flow 0x7f4d1f2a , identity 61418->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.1.157:37122 -> 10.244.1.157:8080 tcp SYN
-> endpoint 167 flow 0x7f4d1f2a , identity 61418->61418 state established ifindex lxccb5c6c879ceb orig-ip 10.244.1.157: 10.244.1.157:37122 -> 10.244.1.157:8080 tcp SYN
<- proxy flow 0xe13162b , identity 61418->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.1.157:37122 -> 10.244.1.157:8080 tcp SYN
-> endpoint 167 flow 0xe13162b , identity 61418->61418 state established ifindex lxccb5c6c879ceb orig-ip 10.244.1.157: 10.244.1.157:37122 -> 10.244.1.157:8080 tcp SYN
-> Request http from 167 ([k8s:app=echo k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=default k8s:io.cilium.k8s.policy.cluster=kind-kind k8s:io.cilium.k8s.policy.serviceaccount=default k8s:io.kubernetes.pod.namespace=default k8s:version=v1]) to 0 ([reserved:world]), identity 61418->2, verdict Forwarded GET http://echo/v1 => 0
-> endpoint 167 flow 0x37659fde , identity world->61418 state reply ifindex lxccb5c6c879ceb orig-ip 10.96.143.130: 10.96.143.130:80 -> 10.244.1.157:37122 tcp ACK
<- endpoint 167 flow 0xe8c894f7 , identity 61418->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.1.157:37122 -> 10.96.143.130:80 tcp ACK
-> proxy port 13874 flow 0xe8c894f7 , identity 61418->unknown state established ifindex 0 orig-ip 0.0.0.0: 10.244.1.157:37122 -> 10.96.143.130:80 tcp ACK
-> Response http to 167 ([k8s:app=echo k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=default k8s:io.cilium.k8s.policy.cluster=kind-kind k8s:io.cilium.k8s.policy.serviceaccount=default k8s:io.kubernetes.pod.namespace=default k8s:version=v1]) from 0 ([reserved:world]), identity 2->61418, verdict Forwarded GET http://echo/v1 => 503
<- endpoint 167 flow 0xe8c894f7 , identity 61418->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.1.157:37122 -> 10.96.143.130:80 tcp ACK, FIN
-> proxy port 13874 flow 0xe8c894f7 , identity 61418->unknown state established ifindex 0 orig-ip 0.0.0.0: 10.244.1.157:37122 -> 10.96.143.130:80 tcp ACK, FIN
-> endpoint 167 flow 0x37659fde , identity world->61418 state reply ifindex lxccb5c6c879ceb orig-ip 10.96.143.130: 10.96.143.130:80 -> 10.244.1.157:37122 tcp ACK, FIN
<- endpoint 167 flow 0xe8c894f7 , identity 61418->unknown state unknown ifindex 0 orig-ip 0.0.0.0: 10.244.1.157:37122 -> 10.96.143.130:80 tcp ACK
-> proxy port 13874 flow 0xe8c894f7 , identity 61418->unknown state established ifindex 0 orig-ip 0.0.0.0: 10.244.1.157:37122 -> 10.96.143.130:80 tcp ACK
<- endpoint 167 flow 0x0 , identity 61418->unknown state unknown ifindex 0 orig-ip 0.0.0.0: fe80::c8cd:30ff:fed3:ddcf -> ff02::2 icmp RouterSolicitation
xx drop (Unsupported L3 protocol) flow 0x0 to endpoint 0, ifindex 21, file bpf_lxc.c:1530, , identity 61418->unknown: fe80::c8cd:30ff:fed3:ddcf -> ff02::2 icmp RouterSolicitation
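
For reference, a minimal sketch of how the endpoint ID passed to --related-to (167, the client pod's endpoint) can be looked up from the Cilium agent on the node running the echo-v1 pod; the label selector, container name, and node name below are assumptions based on a default kind installation:

CILIUM_POD=$(kubectl -n kube-system get pod -l k8s-app=cilium \
  --field-selector spec.nodeName=kind-worker -o jsonpath='{.items[0].metadata.name}')
kubectl -n kube-system exec "$CILIUM_POD" -c cilium-agent -- cilium endpoint list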

Anything else?

No response

Cilium Users Document

  • Are you a user of Cilium? Please add yourself to the Users doc

Code of Conduct

  • I agree to follow this project's Code of Conduct

Metadata

Labels

  • area/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
  • area/loadbalancing: Impacts load-balancing and Kubernetes service implementations.
  • area/proxy: Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers.
  • kind/bug: This is a bug in the Cilium logic.
  • kind/community-report: This was reported by a user in the Cilium community, eg via Slack.
