Skip to content

Conversation

brb
Copy link
Member

@brb brb commented Jun 27, 2024

Previously, the redirect from L3 netdev (in nodeport.h) to Cilium's tunnel device failed due to the bpf_redirect returning -ERANGE [1].

Further investigation revealed the following values for the troublesome skb:

        skb->len = 60
        skb->data = 0x000000009566a77e
        skb->head = 0x000000001a70ff88
        skb->network_header = 64

"skb->data" was set to a weird location when compared to "skb->head", and thus made the check to fail.

Adding the L2 hdr before redirecting to the tunnel device resolved the issue.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/net/core/filter.c?h=v6.7.4#n2147

Fix #26847

cc @jspaleta

@brb brb added kind/bug This is a bug in the Cilium logic. area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. needs-backport/1.14 labels Jun 27, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Jun 27, 2024
@brb brb force-pushed the pr/brb/fix-np-redir-l3-to-tunnel branch from e9951ca to 9e852ad Compare June 27, 2024 10:12
@brb brb added the release-note/bug This PR fixes an issue in a previous release of Cilium. label Jun 27, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jun 27, 2024
@brb brb force-pushed the pr/brb/fix-np-redir-l3-to-tunnel branch from 9e852ad to 1c35824 Compare June 27, 2024 10:19
Previously, the redirect from L3 netdev (in nodeport.h) to Cilium's
tunnel device failed due to the bpf_redirect returning -ERANGE [1].

Further investigation revealed the following values for the troublesome
skb:

    skb->len = 60
    skb->data = 0x000000009566a77e
    skb->head = 0x000000001a70ff88
    skb->network_header = 64

"skb->data" was set to a weird location when compared to "skb->head",
and thus made the check to fail.

Adding the L2 hdr before redirecting to the tunnel device resolved the
issue.

P.S., I wanted to use "if (__ctx_is == __ctx_skb)" instead of the #if.
Unfortunately, then the bpf_xdp.c fails to compile with:

    error: use of undeclared identifier 'ENCAP_IFINDEX'

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/net/core/filter.c?h=v6.7.4#
n2147

Signed-off-by: Martynas Pumputis <m@lambda.lt>
@brb brb force-pushed the pr/brb/fix-np-redir-l3-to-tunnel branch from 1c35824 to a4b5ee7 Compare June 27, 2024 10:30
To test the fix in the previous commit.

Ideally, we would want to test against running L3 netdev and vxlan
netdev with Cilium's BPF progs loaded. Unfortunately, we don't have an
infrastructure for such a test. Thus, only check that the L2 hdr is
appended before the redirect.

Signed-off-by: Martynas Pumputis <m@lambda.lt>
@brb brb force-pushed the pr/brb/fix-np-redir-l3-to-tunnel branch from a4b5ee7 to 4159ad9 Compare June 27, 2024 10:33
@brb
Copy link
Member Author

brb commented Jun 27, 2024

/test

@brb brb marked this pull request as ready for review June 27, 2024 10:42
@brb brb requested a review from a team as a code owner June 27, 2024 10:42
@brb brb requested a review from markpash June 27, 2024 10:42
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jun 27, 2024
@brb brb added this pull request to the merge queue Jun 27, 2024
Merged via the queue into main with commit aeb5f5d Jun 27, 2024
@brb brb deleted the pr/brb/fix-np-redir-l3-to-tunnel branch June 27, 2024 18:04
@julianwiedmann
Copy link
Member

@brb do you have any smart ideas how to wire up L3 device testing into CI ? I feel we're playing whack-a-mole here :/. Surely the other users of nodeport_add_tunnel_encap() in nodeport.h would need a similar fix?

@brb
Copy link
Member Author

brb commented Jul 1, 2024

For this PR I experimented with integration tests in which we attach BPF progs (to L3 netdev and tunnel netdev) from the partly mocked agent, and then inject with libpcap a packet. Unfortunately, the former turned to be not that straight forward task, so I abandoned it due to time constraints. Nevertheless, I'd like to chat with folks (foundations / loader / agent) about adding such tests (maybe at the hive time?).

Another approach would be to extend Kind setup in ci-e2e by adding L3 netdev. This feels like doable in a reasonable amount of time. Let me create an issue for it.

@pippolo84 pippolo84 added backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. and removed needs-backport/1.15 labels Jul 2, 2024
@pippolo84 pippolo84 mentioned this pull request Jul 2, 2024
7 tasks
@pippolo84 pippolo84 added backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. and removed needs-backport/1.14 labels Jul 2, 2024
@github-actions github-actions bot added backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. and removed backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. labels Jul 8, 2024
@julianwiedmann julianwiedmann added backport/author The backport will be carried out by the author of the PR. needs-backport/1.14 and removed backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. labels Jul 24, 2024
julianwiedmann added a commit to julianwiedmann/cilium that referenced this pull request Oct 2, 2024
When redirecting from a L3 device to the overlay interface, we need to
manually add a L2 header to the (inner) packet.

cilium#33421 fixed this for the case of
Nodeport NAT traffic from the LB node to a backend. Generalize it so
that it helps all users of the nodeport_add_tunnel_encap() helper - for
example DSR-Geneve or EgressGW reply traffic.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
julianwiedmann added a commit to julianwiedmann/cilium that referenced this pull request Oct 2, 2024
When redirecting from a L3 device to the overlay interface, we need to
manually add a L2 header to the (inner) packet.

cilium#33421 fixed this for the case of
Nodeport NAT traffic from the LB node to a backend. Generalize it so
that it helps all users of the nodeport_add_tunnel_encap() helper - for
example DSR-Geneve or EgressGW reply traffic.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
@julianwiedmann julianwiedmann added backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. and removed needs-backport/1.14 labels Oct 7, 2024
@github-actions github-actions bot added backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. and removed backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. labels Oct 7, 2024
julianwiedmann added a commit to julianwiedmann/cilium that referenced this pull request Oct 8, 2024
When redirecting from a L3 device to the overlay interface, we need to
manually add a L2 header to the (inner) packet.

cilium#33421 fixed this for the case of
Nodeport NAT traffic from the LB node to a backend. Generalize it so
that it helps all users of the nodeport_add_tunnel_encap() helper - for
example DSR-Geneve or EgressGW reply traffic.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
github-merge-queue bot pushed a commit that referenced this pull request Oct 13, 2024
When redirecting from a L3 device to the overlay interface, we need to
manually add a L2 header to the (inner) packet.

#33421 fixed this for the case of
Nodeport NAT traffic from the LB node to a backend. Generalize it so
that it helps all users of the nodeport_add_tunnel_encap() helper - for
example DSR-Geneve or EgressGW reply traffic.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
tklauser pushed a commit that referenced this pull request Oct 22, 2024
[ upstream commit abdaddc ]

When redirecting from a L3 device to the overlay interface, we need to
manually add a L2 header to the (inner) packet.

#33421 fixed this for the case of
Nodeport NAT traffic from the LB node to a backend. Generalize it so
that it helps all users of the nodeport_add_tunnel_encap() helper - for
example DSR-Geneve or EgressGW reply traffic.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Tobias Klauser <tobias@cilium.io>
github-merge-queue bot pushed a commit that referenced this pull request Oct 22, 2024
[ upstream commit abdaddc ]

When redirecting from a L3 device to the overlay interface, we need to
manually add a L2 header to the (inner) packet.

#33421 fixed this for the case of
Nodeport NAT traffic from the LB node to a backend. Generalize it so
that it helps all users of the nodeport_add_tunnel_encap() helper - for
example DSR-Geneve or EgressGW reply traffic.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Tobias Klauser <tobias@cilium.io>
rastislavs pushed a commit that referenced this pull request Oct 28, 2024
[ upstream commit abdaddc ]

When redirecting from a L3 device to the overlay interface, we need to
manually add a L2 header to the (inner) packet.

#33421 fixed this for the case of
Nodeport NAT traffic from the LB node to a backend. Generalize it so
that it helps all users of the nodeport_add_tunnel_encap() helper - for
example DSR-Geneve or EgressGW reply traffic.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
julianwiedmann added a commit that referenced this pull request Oct 29, 2024
[ upstream commit abdaddc ]

When redirecting from a L3 device to the overlay interface, we need to
manually add a L2 header to the (inner) packet.

#33421 fixed this for the case of
Nodeport NAT traffic from the LB node to a backend. Generalize it so
that it helps all users of the nodeport_add_tunnel_encap() helper - for
example DSR-Geneve or EgressGW reply traffic.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. backport/author The backport will be carried out by the author of the PR. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Accessing ClusterIP services doesn't work via a Tailscale pod
4 participants