-
Notifications
You must be signed in to change notification settings - Fork 3.4k
datapath: Fix redirect from from L3 netdev to tunnel #33421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
e9951ca
to
9e852ad
Compare
9e852ad
to
1c35824
Compare
Previously, the redirect from L3 netdev (in nodeport.h) to Cilium's tunnel device failed due to the bpf_redirect returning -ERANGE [1]. Further investigation revealed the following values for the troublesome skb: skb->len = 60 skb->data = 0x000000009566a77e skb->head = 0x000000001a70ff88 skb->network_header = 64 "skb->data" was set to a weird location when compared to "skb->head", and thus made the check to fail. Adding the L2 hdr before redirecting to the tunnel device resolved the issue. P.S., I wanted to use "if (__ctx_is == __ctx_skb)" instead of the #if. Unfortunately, then the bpf_xdp.c fails to compile with: error: use of undeclared identifier 'ENCAP_IFINDEX' [1]: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/net/core/filter.c?h=v6.7.4# n2147 Signed-off-by: Martynas Pumputis <m@lambda.lt>
1c35824
to
a4b5ee7
Compare
To test the fix in the previous commit. Ideally, we would want to test against running L3 netdev and vxlan netdev with Cilium's BPF progs loaded. Unfortunately, we don't have an infrastructure for such a test. Thus, only check that the L2 hdr is appended before the redirect. Signed-off-by: Martynas Pumputis <m@lambda.lt>
a4b5ee7
to
4159ad9
Compare
/test |
@brb do you have any smart ideas how to wire up L3 device testing into CI ? I feel we're playing whack-a-mole here :/. Surely the other users of |
For this PR I experimented with integration tests in which we attach BPF progs (to L3 netdev and tunnel netdev) from the partly mocked agent, and then inject with libpcap a packet. Unfortunately, the former turned to be not that straight forward task, so I abandoned it due to time constraints. Nevertheless, I'd like to chat with folks (foundations / loader / agent) about adding such tests (maybe at the hive time?). Another approach would be to extend Kind setup in ci-e2e by adding L3 netdev. This feels like doable in a reasonable amount of time. Let me create an issue for it. |
When redirecting from a L3 device to the overlay interface, we need to manually add a L2 header to the (inner) packet. cilium#33421 fixed this for the case of Nodeport NAT traffic from the LB node to a backend. Generalize it so that it helps all users of the nodeport_add_tunnel_encap() helper - for example DSR-Geneve or EgressGW reply traffic. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
When redirecting from a L3 device to the overlay interface, we need to manually add a L2 header to the (inner) packet. cilium#33421 fixed this for the case of Nodeport NAT traffic from the LB node to a backend. Generalize it so that it helps all users of the nodeport_add_tunnel_encap() helper - for example DSR-Geneve or EgressGW reply traffic. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
When redirecting from a L3 device to the overlay interface, we need to manually add a L2 header to the (inner) packet. cilium#33421 fixed this for the case of Nodeport NAT traffic from the LB node to a backend. Generalize it so that it helps all users of the nodeport_add_tunnel_encap() helper - for example DSR-Geneve or EgressGW reply traffic. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
When redirecting from a L3 device to the overlay interface, we need to manually add a L2 header to the (inner) packet. #33421 fixed this for the case of Nodeport NAT traffic from the LB node to a backend. Generalize it so that it helps all users of the nodeport_add_tunnel_encap() helper - for example DSR-Geneve or EgressGW reply traffic. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
[ upstream commit abdaddc ] When redirecting from a L3 device to the overlay interface, we need to manually add a L2 header to the (inner) packet. #33421 fixed this for the case of Nodeport NAT traffic from the LB node to a backend. Generalize it so that it helps all users of the nodeport_add_tunnel_encap() helper - for example DSR-Geneve or EgressGW reply traffic. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> Signed-off-by: Tobias Klauser <tobias@cilium.io>
[ upstream commit abdaddc ] When redirecting from a L3 device to the overlay interface, we need to manually add a L2 header to the (inner) packet. #33421 fixed this for the case of Nodeport NAT traffic from the LB node to a backend. Generalize it so that it helps all users of the nodeport_add_tunnel_encap() helper - for example DSR-Geneve or EgressGW reply traffic. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> Signed-off-by: Tobias Klauser <tobias@cilium.io>
[ upstream commit abdaddc ] When redirecting from a L3 device to the overlay interface, we need to manually add a L2 header to the (inner) packet. #33421 fixed this for the case of Nodeport NAT traffic from the LB node to a backend. Generalize it so that it helps all users of the nodeport_add_tunnel_encap() helper - for example DSR-Geneve or EgressGW reply traffic. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
[ upstream commit abdaddc ] When redirecting from a L3 device to the overlay interface, we need to manually add a L2 header to the (inner) packet. #33421 fixed this for the case of Nodeport NAT traffic from the LB node to a backend. Generalize it so that it helps all users of the nodeport_add_tunnel_encap() helper - for example DSR-Geneve or EgressGW reply traffic. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
Previously, the redirect from L3 netdev (in nodeport.h) to Cilium's tunnel device failed due to the bpf_redirect returning -ERANGE [1].
Further investigation revealed the following values for the troublesome skb:
"skb->data" was set to a weird location when compared to "skb->head", and thus made the check to fail.
Adding the L2 hdr before redirecting to the tunnel device resolved the issue.
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/net/core/filter.c?h=v6.7.4#n2147
Fix #26847
cc @jspaleta