Accessing ClusterIP services doesn't work via a Tailscale pod #26847

@wokalski

Description

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

I am running Tailscale (a WireGuard-based VPN) inside a pod with the NET_ADMIN capability.
I have tested two scenarios:

  1. Accessing a pod via its IP
  2. Accessing a ClusterIP service

Accessing the pod works, but accessing the ClusterIP doesn't.

Traffic directed to the ClusterIP also does not show up in Hubble when I observe for it. Instead, the packets are forwarded to the default gateway (and lost there).
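The two scenarios can be compared with a plain TCP probe run from a machine on the tailnet. The following is a minimal sketch, not the reporter's actual test: the helper names `can_connect` and `probe_targets` are hypothetical, and the caller has to supply real addresses and ports.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP handshake with host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def probe_targets(targets: dict) -> dict:
    """Probe each labelled (host, port) pair and report reachability per label."""
    return {label: can_connect(host, port) for label, (host, port) in targets.items()}
```

Calling `probe_targets` with one entry for the pod IP and one for the ClusterIP would, for the behaviour described in this report, be expected to return True for the pod and False for the service.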

Tailscale installs the following iptables rules:

*nat
:PREROUTING ACCEPT [18:1296]
:INPUT ACCEPT [18:1296]
:OUTPUT ACCEPT [196:15731]
:POSTROUTING ACCEPT [196:15731]
:ts-postrouting - [0:0]
-A PREROUTING -d 100.108.21.117/32 -j DNAT --to-destination 10.43.159.168
-A POSTROUTING -j ts-postrouting
-A ts-postrouting -m mark --mark 0x40000/0xff0000 -j MASQUERADE
COMMIT
# Completed on Sat Jul 15 12:30:20 2023
# Generated by iptables-save v1.8.8 on Sat Jul 15 12:30:20 2023
*filter
:INPUT ACCEPT [799:96468]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [814:92651]
:ts-forward - [0:0]
:ts-input - [0:0]
-A INPUT -j ts-input
-A FORWARD -j ts-forward
-A ts-forward -i tailscale0 -j MARK --set-xmark 0x40000/0xff0000
-A ts-forward -m mark --mark 0x40000/0xff0000 -j ACCEPT
-A ts-forward -s 100.64.0.0/10 -o tailscale0 -j DROP
-A ts-forward -o tailscale0 -j ACCEPT
-A ts-input -s 100.108.21.117/32 -i lo -j ACCEPT
-A ts-input -s 100.115.92.0/23 ! -i tailscale0 -j RETURN
-A ts-input -s 100.64.0.0/10 ! -i tailscale0 -j DROP
COMMIT
# Completed on Sat Jul 15 12:30:20 2023
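To make the mark handling in the dump easier to follow, here is a small sketch of the bit arithmetic behind `--set-xmark 0x40000/0xff0000` and `-m mark --mark 0x40000/0xff0000`. The function names are illustrative only; the semantics follow the iptables documentation for the MARK target and the mark match.

```python
TS_VALUE = 0x40000   # value used by the ts-forward and ts-postrouting rules
TS_MASK = 0xFF0000   # mask used by the same rules


def set_xmark(packet_mark: int, value: int, mask: int) -> int:
    """MARK --set-xmark value/mask: zero the bits in mask, then XOR value in."""
    return (packet_mark & ~mask & 0xFFFFFFFF) ^ value


def mark_matches(packet_mark: int, value: int, mask: int) -> bool:
    """-m mark --mark value/mask: match when (mark & mask) == value."""
    return (packet_mark & mask) == value


# A packet entering on tailscale0 starts unmarked and is marked in ts-forward.
marked = set_xmark(0x0, TS_VALUE, TS_MASK)
print(hex(marked))                                # 0x40000
print(mark_matches(marked, TS_VALUE, TS_MASK))    # True: ACCEPT, later MASQUERADE
print(mark_matches(0x0, TS_VALUE, TS_MASK))       # False: not from tailscale0
```

Read this way, packets arriving on tailscale0 are DNATed to 10.43.159.168, marked, accepted, and masqueraded on the way out, which is consistent with the host seeing the Tailscale pod's IP as the source address.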

On the host I can see the traffic in tcpdump as going from the tailscale pod's IP to the service IP. Unfortunately, it's routed to the default gateway (a router) that doesn't know what to do with it and drops it.
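The symptom matches what a plain longest-prefix route lookup would do with a destination that nothing has translated yet. The sketch below uses a made-up routing table: the pod CIDR 10.42.0.0/16 and the next-hop labels are assumptions for illustration, not values taken from this cluster.

```python
import ipaddress

# Hypothetical host routing table: (prefix, next hop label).
ROUTES = [
    ("0.0.0.0/0", "default gateway"),
    ("10.42.0.0/16", "cilium_host"),  # assumed pod CIDR
]


def lookup_route(dest: str, routes=ROUTES):
    """Return the next hop of the most specific route containing dest, or None."""
    dest_ip = ipaddress.ip_address(dest)
    best = None
    for prefix, next_hop in routes:
        net = ipaddress.ip_network(prefix)
        if dest_ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


print(lookup_route("10.42.1.7"))       # cilium_host
print(lookup_route("10.43.159.168"))   # default gateway
```

A ClusterIP is a virtual address, so unless it is rewritten to a backend pod IP before the routing decision, only the default route matches it.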

I found a related thread on Tailscale's forum where one user reported that enabling lbExternalClusterIP solved the issue for them. However, it made no difference in my case. I am also confused because accessing the ClusterIP from the physical node itself does work (as you are probably well aware; I wasn't).

I don't understand why Cilium doesn't seem to "pick up" this traffic.

Cilium Version

Client: 1.12.0 9447cd1 2022-07-19T12:22:00+02:00 go version go1.18.4 linux/amd64
Daemon: 1.12.0 9447cd1 2022-07-19T12:22:00+02:00 go version go1.18.4 linux/amd64

Kernel Version

Linux minio5 5.15.117-flatcar #1 SMP Tue Jul 4 14:43:38 -00 2023 x86_64 Intel(R) Xeon(R) CPU E5-2440 0 @ 2.40GHz GenuineIntel GNU/Linux

Kubernetes Version

v1.23.8+k3s2

Sysdump

https://drive.google.com/file/d/1GYxkeaGzXHIOuOM9DYFDLHFgUUXUypA3/view?usp=sharing

Relevant log output

No response

Anything else?

This exact setup works on another cluster, but that "cluster" is just a single-node k3s deployment. On that cluster I can access ClusterIPs via Tailscale (same Cilium version, same config).

Related forum thread: https://forum.tailscale.com/t/tailscale-proxy-in-k8s-with-cilium-works-with-pod-not-with-svc/1910/8
Seemingly related: #26584

Code of Conduct

  • I agree to follow this project's Code of Conduct

Labels

  • area/datapath
  • info-completed
  • kind/bug
  • kind/community-report
  • stale
