Can't connect to remote NodePort backend in hostNetwork (with tunnel enabled) #22557

@michaelasp

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

In Cilium v1.12 with tunneling enabled, traffic to NodePort services does not work when hostNetwork pods are involved. I believe the issue stems from PR #18815. The function DirectRoutingDeviceRequired returns false in this configuration, so the macro IPV4_DIRECT_ROUTING is never set. NodePort needs that macro as the SNAT IP for traffic that is not tunneled (i.e. host network pods), so I think it still has to be set in this case. Thanks for looking at this issue; it took us a while to trace down, but hopefully this is detailed enough.

The datapath code that uses this macro is here:

cilium/bpf/lib/nat.h

Lines 599 to 613 in 0759290

#if defined(TUNNEL_MODE) && defined(IS_BPF_OVERLAY)
	if (ip4->saddr == IPV4_GATEWAY) {
		target->addr = IPV4_GATEWAY;
		return true;
	}
#else
	/* NATIVE_DEV_IFINDEX == DIRECT_ROUTING_DEV_IFINDEX cannot be moved into
	 * preprocessor, as the former is known only during load time (templating).
	 * This checks whether bpf_host is running on the direct routing device.
	 */
	if (DIRECT_ROUTING_DEV_IFINDEX == NATIVE_DEV_IFINDEX &&
	    ip4->saddr == IPV4_DIRECT_ROUTING) {
		target->addr = IPV4_DIRECT_ROUTING;
		return true;
	}

I believe that hostNetwork traffic falls into the second branch of the conditional (the #else path), which breaks the traffic flow.

Cilium Version

v1.12.4

Kernel Version

5.4.0-42-generic

Kubernetes Version

1.25

Sysdump

N/A

Relevant log output

No response

Anything else?

TCP Dump:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vxlan0, link-type EN10MB (Ethernet), capture size 262144 bytes
22:22:09.271839 IP 10.200.0.2.47792 > 10.200.0.52.29442: Flags [S], seq 2615184186, win 64390, options [mss 1370,sackOK,TS val 2929822314 ecr 0,nop,wscale 7], length 0
22:22:09.271923 IP 0.0.0.0.47792 > 10.200.0.5.29442: Flags [S], seq 2615184186, win 64390, options [mss 1370,sackOK,TS val 2929822314 ecr 0,nop,wscale 7], length 0

The SNAT source IP is replaced by 0.0.0.0 because IPV4_DIRECT_ROUTING is not set.
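
To make the suspected failure mode concrete, here is a simplified user-space model of the address selection. It is only a sketch and not Cilium's datapath code: struct node_cfg, model_snat_addr, format_ipv4 and the IPV4 helper macro are hypothetical names, and it assumes that an unset IPV4_DIRECT_ROUTING behaves like the value 0, as the tcpdump output suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the compile-time node configuration.
 * ipv4_direct_routing == 0 models IPV4_DIRECT_ROUTING not being set
 * (an assumption based on the 0.0.0.0 source seen in the capture).
 */
struct node_cfg {
	bool tunnel_mode;             /* models TUNNEL_MODE */
	bool is_bpf_overlay;          /* models IS_BPF_OVERLAY */
	uint32_t ipv4_gateway;        /* models IPV4_GATEWAY */
	uint32_t ipv4_direct_routing; /* models IPV4_DIRECT_ROUTING */
};

/* Build a host-order IPv4 address from four octets. */
#define IPV4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Pick the address the model would use as SNAT source.
 * Only the overlay program in tunnel mode uses the gateway address;
 * every other path falls back to the direct routing address.
 */
static uint32_t model_snat_addr(const struct node_cfg *cfg)
{
	assert(cfg != NULL);
	if (cfg->tunnel_mode && cfg->is_bpf_overlay)
		return cfg->ipv4_gateway;
	return cfg->ipv4_direct_routing;
}

/* Format a host-order IPv4 address as dotted decimal. */
static void format_ipv4(uint32_t addr, char *buf, size_t len)
{
	snprintf(buf, len, "%u.%u.%u.%u",
		 (unsigned)((addr >> 24) & 0xff),
		 (unsigned)((addr >> 16) & 0xff),
		 (unsigned)((addr >> 8) & 0xff),
		 (unsigned)(addr & 0xff));
}
```

In this model, a configuration with tunnel_mode set, is_bpf_overlay unset and ipv4_direct_routing left at 0 yields a SNAT source of 0.0.0.0, which matches the second packet in the capture. Giving ipv4_direct_routing a real address makes the same path return that address instead.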

Code of Conduct

  • I agree to follow this project's Code of Conduct

Labels

  • area/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
  • area/loadbalancing: Impacts load-balancing and Kubernetes service implementations
  • kind/bug: This is a bug in the Cilium logic.
  • kind/community-report: This was reported by a user in the Cilium community, eg via Slack.
