Skip to content

Conversation

julianwiedmann
Copy link
Member

Add the missing bits in the IPv6 SNAT engine, so that ICMPV6_DEST_UNREACH packets (which we generate for a NodePort service without backends) can exit the node.

@julianwiedmann julianwiedmann added area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/minor This PR changes functionality that users may find relevant to operating Cilium. area/loadbalancing Impacts load-balancing and Kubernetes service implementations feature/snat Relates to SNAT or Masquerading of traffic labels Feb 20, 2025
@julianwiedmann julianwiedmann force-pushed the 1.18-bpf-icmp6-dest-unreach branch from 59362f9 to c88fcf9 Compare February 20, 2025 12:36
@julianwiedmann julianwiedmann changed the title 1.18 bpf icmp6 dest unreach bpf: nat: support ICMPV6_DEST_UNREACH in egress path Feb 20, 2025
@julianwiedmann
Copy link
Member Author

Alright, let's do this. I'm planning to add a bit more ICMPv6 support (basically establish equivalent to the ICMPv4 support), will look into adding some tests in that context to cover the actual rewrite scenarios.

@julianwiedmann julianwiedmann marked this pull request as ready for review February 20, 2025 12:39
@julianwiedmann julianwiedmann requested a review from a team as a code owner February 20, 2025 12:39
@julianwiedmann
Copy link
Member Author

/test

@julianwiedmann julianwiedmann force-pushed the 1.18-bpf-icmp6-dest-unreach branch from c88fcf9 to 11d88f5 Compare February 21, 2025 13:14
@julianwiedmann
Copy link
Member Author

(de-duplicated the CHECK() stage. Still need to deal with some checksum fallout - our pkt generator currently doesn't fill the ICMPv6 checksum, and something's odd about the IPv4 checksum as well ...)

Copy link
Member

@dylandreimerink dylandreimerink left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. Though based on CI output it seems like the tests need some work.

Add the minimal bits to handle ICMPv6 error messages in the IPv6 NAT
egress path. This is basically a copy of the corresponding ICMPv4 code.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Slim down the inner function __tail_no_service_ipv*() sufficiently, so
that it contains *only* the packet-rewrite logic. This allows us to easily
reuse it from test context, to build identical packets.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Use the intended helper to be a bit more self-documenting.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
The LB path sends an ICMP error message when the accessed service has no
backend available (see cilium#28157).

In the case of a nodeport service, validate that this ICMP error message
can exit the node (and traverse the `to-netdev` program) when
BPF masquerading is enabled.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
@julianwiedmann julianwiedmann force-pushed the 1.18-bpf-icmp6-dest-unreach branch from 11d88f5 to 4cc8aa6 Compare February 26, 2025 10:23
@julianwiedmann
Copy link
Member Author

/test

@julianwiedmann
Copy link
Member Author

julianwiedmann commented Feb 26, 2025

This looks good to me. Though based on CI output it seems like the tests need some work.

@dylandreimerink addressed the tests in a slightly different way, mind having another quick look? Shouldn't be controversial.

Long-term, I think for "multi-tests" (like here) we want a bit of helper logic to copy packets between a CHECK() and a subsequent PKTGEN() stage. This would reflect the test intention better, instead of the second PKTGEN() stage trying to re-construct the result of the first test.

I've refactored the test a bit, re-using the actual production logic to transform the request into an ICMP error message. This actually feels like a good approach in general, converting nodeport.h etc into a library of advanced packet transformation helpers that can be re-used outside of the production datapath.

@dylandreimerink
Copy link
Member

Long-term, I think for "multi-tests" (like here) we want a bit of helper logic to copy packets between a CHECK() and a subsequent PKTGEN() stage. This would reflect the test intention better, instead of the second PKTGEN() stage trying to re-construct the result of the first test.

Yea, we really should start looking into the next generation of BPF testing. We are at the limit of what is practical with the current framework (which is already much more then I ever thought possible).

I've refactored the test a bit, re-using the actual production logic to transform the request into an ICMP error message. This actually feels like a good approach in general, converting nodeport.h etc into a library of advanced packet transformation helpers that can be re-used outside of the production datapath.

Yes, I agree.

Copy link
Member

@dylandreimerink dylandreimerink left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@julianwiedmann julianwiedmann added this pull request to the merge queue Feb 26, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Feb 26, 2025
Merged via the queue into cilium:main with commit bbb2a87 Feb 26, 2025
64 of 65 checks passed
@julianwiedmann julianwiedmann deleted the 1.18-bpf-icmp6-dest-unreach branch February 26, 2025 12:50
@julianwiedmann julianwiedmann added backport/author The backport will be carried out by the author of the PR. needs-backport/1.17 This PR / issue needs backporting to the v1.17 branch labels Feb 26, 2025
@julianwiedmann julianwiedmann added backport-pending/1.17 The backport for Cilium 1.17.x for this PR is in progress. and removed needs-backport/1.17 This PR / issue needs backporting to the v1.17 branch labels May 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. area/loadbalancing Impacts load-balancing and Kubernetes service implementations backport/author The backport will be carried out by the author of the PR. backport-pending/1.17 The backport for Cilium 1.17.x for this PR is in progress. feature/snat Relates to SNAT or Masquerading of traffic ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/minor This PR changes functionality that users may find relevant to operating Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants