Closed
Labels
- area/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
- area/loadbalancing: Impacts load-balancing and Kubernetes service implementations
- feature/ipv6: Relates to IPv6 protocol support
- feature/socket-lb: Impacts the Socket-LB part of Cilium's kube-proxy replacement.
- kind/bug: This is a bug in the Cilium logic.
- kind/community-report: This was reported by a user in the Cilium community, eg via Slack.
- needs/triage: This issue requires triaging to establish severity and next steps.
Description
Is there an existing issue for this?
- I have searched the existing issues
Version
equal or higher than v1.15.16 and lower than v1.16.0
What happened?
We have the following setup:
- A server (statsd) that binds an ipv4 udp socket on 0.0.0.0:8126
- A SingleStack service created in k8s for this server
- A client that talks to the service using the v4-mapped-on-v6 address type. I.e. the client creates an AF_INET6 socket, and then uses the ::ffff: prefix to talk to the service, e.g. ::ffff:10.2.3.4.
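For reference, the server side of this setup can be approximated with the minimal Python sketch below. It is not the actual statsd implementation; the function names are hypothetical, and only the socket family, socket type, and bind address (0.0.0.0:8126) come from the setup described above.

```python
import socket


def make_statsd_like_server(host="0.0.0.0", port=8126):
    """Bind an IPv4 UDP socket, mirroring how the statsd server listens."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def serve(sock, max_packets=None):
    """Print each datagram and its sender; loops forever when max_packets is None."""
    seen = 0
    while max_packets is None or seen < max_packets:
        data, peer = sock.recvfrom(65535)
        print(f"{peer[0]}:{peer[1]} -> {data!r}")
        seen += 1
```

Running `serve(make_statsd_like_server())` in the server pod and watching which pod still prints datagrams after a restart is one way to observe where the client's packets end up.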
This all works fine until the statsd server restarts. Once restarted, the client continues to send udp packets to the old statsd pod ip address.
This seems very similar to other issues around socket closure that have been reported, e.g.:
- Stale UDP connections from NGINX to CoreDNS on fargate with kube-proxy replacement #35773
- Long lived unidirectional UDP connections trough a service failing to reach new endpoints #37577
We have the following relevant cilium options set (using cilium version v1.16.0):
bpf-lb-sock: "true"
bpf-lb-sock-hostns-only: "false"
bpf-lb-sock-terminate-pod-connections: "true"
Some interesting things we've observed:
- With bpf-lb-sock and bpf-lb-sock-terminate-pod-connections set, if the client uses an ipv4 AF_INET socket instead, sockets get forcibly closed / terminated as expected.
- If the server, service, and client all use ipv6, the sockets are not closed as expected.
Essentially it seems like the stale-socket-finding-and-destruction business is working correctly for ipv4, but not ipv6.
How can we reproduce the issue?
- Bind a SOCK_DGRAM AF_INET socket in a server.
- Create a corresponding kubernetes Service for this server.
- In a client, create a SOCK_DGRAM AF_INET6 socket. Configure the address to use a v4-mapped-on-v6 address type. E.g. if the kubernetes service address is 10.2.3.4, configure the address the client will connect to as ::ffff:10.2.3.4.
- Use the connect() + send() syscalls to create a long-lived udp socket, and send packets at some interval (e.g. every 10 seconds) from the client to the server. Note: using sendto(), i.e. effectively a short-lived socket, works around the issue and does not produce the undesired behavior.
- Restart the server.
- Notice how the packets from the client are still sent to the old server ip address.
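The client side of these steps can be sketched in Python roughly as follows. This is a minimal illustration rather than our actual client: the helper names and the statsd-style payload are hypothetical, the service port is assumed to be the same 8126 the server binds, and clearing IPV6_V6ONLY is only a precaution in case the system default is v6-only.

```python
import ipaddress
import socket
import time


def v4_mapped(ipv4: str) -> str:
    """Return the v4-mapped-on-v6 form of an IPv4 address (::ffff:a.b.c.d)."""
    ipaddress.IPv4Address(ipv4)  # raises ValueError if ipv4 is not a valid address
    return f"::ffff:{ipv4}"


def open_connected_client(service_ip: str, port: int) -> socket.socket:
    """Create an AF_INET6 UDP socket and connect() it to the v4-mapped service IP."""
    sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    # Make sure v4-mapped destinations are allowed on this socket.
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
    sock.connect((v4_mapped(service_ip), port))
    return sock


def send_connected(sock, payload=b"repro.counter:1|c", interval=10.0, count=None):
    """connect() + send(): the long-lived socket that keeps hitting the old pod IP."""
    sent = 0
    while count is None or sent < count:
        sock.send(payload)
        sent += 1
        time.sleep(interval)


def send_unconnected(service_ip, port, payload=b"repro.counter:1|c",
                     interval=10.0, count=None):
    """sendto() on an unconnected socket: the variant that works around the issue."""
    sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
    sent = 0
    while count is None or sent < count:
        sock.sendto(payload, (v4_mapped(service_ip), port))
        sent += 1
        time.sleep(interval)
```

For example, `send_connected(open_connected_client("10.2.3.4", 8126))` corresponds to the failing case, while `send_unconnected("10.2.3.4", 8126)` corresponds to the sendto() workaround mentioned in the steps.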
Cilium Version
v1.16.0
Kernel Version
Linux 5.4.0-204-generic #224-Ubuntu SMP Thu Dec 5 13:38:28 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Kubernetes Version
v1.30.6
Regression
No response
Sysdump
No response
Relevant log output
Anything else?
No response