rp_filter (default) strict mode breaks certain load balancing cases in kube-proxy-free mode #13130

@aditighag

In a multi-node environment, an external (N-S) request destined to a NodePort service IP is dropped by rp_filter strict mode when it arrives on a node that is running one of the service backends.

When a request arrives on the node where a backend is running, the destination address is translated to that of one of the locally running backend pods, and the translated packet is then dropped by rp_filter strict mode. Strict mode checks the reverse path for an incoming packet against the FIB table; here that path would be an lxc-xxx interface when an endpoint route exists for the backend pod, or the cilium_host interface otherwise, rather than the interface the packet arrived on. The translated packets are therefore dropped.
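To confirm which mode is actually in effect on the ingress device, a minimal sketch follows. It relies on the kernel's documented behaviour that the larger of `conf/all/rp_filter` and `conf/<iface>/rp_filter` is applied (0 = off, 1 = strict, 2 = loose). The function names and the `CONF_DIR` variable are made up for this illustration; they are not part of Cilium.

```shell
#!/bin/sh
# Directory holding per-interface IPv4 settings.
# CONF_DIR is a hypothetical override, useful for dry runs against a copy.
CONF_DIR="${CONF_DIR:-/proc/sys/net/ipv4/conf}"

# Print the rp_filter value the kernel applies on interface $1:
# the maximum of conf/all/rp_filter and conf/<iface>/rp_filter.
effective_rp_filter() {
    all=$(cat "$CONF_DIR/all/rp_filter")
    dev=$(cat "$CONF_DIR/$1/rp_filter")
    if [ "$all" -gt "$dev" ]; then echo "$all"; else echo "$dev"; fi
}

# Map the numeric value to its mode name.
rp_filter_mode() {
    case "$1" in
        0) echo "off" ;;
        1) echo "strict" ;;
        2) echo "loose" ;;
        *) echo "unknown" ;;
    esac
}
```

For example, `rp_filter_mode "$(effective_rp_filter eth0)"` prints the mode for `eth0`, the device named in the martian log in this report.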

You can monitor the drops using logs and stats:

nstat -rsz | grep IPReversePathFilter

sysctl -w net.ipv4.conf.all.log_martians=1
tail -fn0 /var/log/messages
:
Sep  2 23:57:53 ip-192-168-55-95 kernel: IPv4: martian source 192.168.156.88 from 192.168.2.138, on dev eth0

ip route get 192.168.156.88 from 192.168.2.138 iif eth0
RTNETLINK answers: Invalid cross-device link

A potential fix is for the cilium agent to set rp_filter to loose mode [1], which allows asymmetric routing such as in the Cilium case. However, this might not help in environments where systemd overrides such settings after Cilium starts [2].
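As a rough manual equivalent of that proposal, the sketch below writes `2` (loose) to the per-interface setting. Because the kernel applies the larger of the `all` and per-interface values, this takes effect even when `conf/all/rp_filter` is `1`. The function name and `CONF_DIR` are hypothetical, and which interfaces the agent would actually change is not specified in this issue; `eth0` is used only because it is the device in the martian log. This does not address the systemd override case in [2].

```shell
#!/bin/sh
# Directory holding per-interface IPv4 settings.
# CONF_DIR is a hypothetical override, useful for dry runs against a copy.
CONF_DIR="${CONF_DIR:-/proc/sys/net/ipv4/conf}"

# Switch each named interface to rp_filter loose mode (2).
# Needs root when run against the real /proc/sys tree.
set_rp_filter_loose() {
    for dev in "$@"; do
        echo 2 > "$CONF_DIR/$dev/rp_filter" || return 1
    done
}
```

Usage, as root: `set_rp_filter_loose eth0`.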

This was discussed in the sig-datapath meeting on 09/09 [3].

[1] https://tools.ietf.org/html/rfc3704#page-6
[2] #10645
[3] https://docs.google.com/document/d/1IZZ-FTM_U3zN06-M_mDOgep_DcDM0Oe3YOFVM1BTIt8/edit#heading=h.39952dxvg45a

Labels

area/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
kind/bug: This is a bug in the Cilium logic.
