Skip to content

Conversation

julianwiedmann
Copy link
Member

In the case of ICMP ECHO packets, the port is really just an arbitrary 16-bit identifier. We can't skip SNAT based on what value this identifier has.

For completeness also do this change for ECHO_REPLY packets, even though it won't make a difference there. It will be needed once we stop clamping the ICMP identifier NAT to the nat_target's min_port / max_port range.

In the case of ICMP ECHO packets, the `port` is really just an arbitrary
16-bit identifier. We can't skip SNAT based on what value this identifier
has.

For completeness also do this change for ECHO_REPLY packets, even though
it won't make a difference there. It will be needed once we stop clamping
the ICMP identifier NAT to the nat_target's min_port / max_port range.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
@julianwiedmann julianwiedmann added kind/bug This is a bug in the Cilium logic. area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/misc This PR makes changes that have no direct user impact. feature/snat Relates to SNAT or Masquerading of traffic severity/low Impact is limited, only affecting unusual configs, or all configs with low impact in prod. labels May 14, 2025
@julianwiedmann
Copy link
Member Author

Looking at how this code works in detail, I believe we currently would only see issues if multiple host processes (with saddr == IPV4_MASQUERADE) are concurrently sending pings to the same destination, using the same identifier value.
Here skipping the NAT means that we can't resolve the identifier re-use, and can't differentiate the ECHO_REPLY packets.

This doesn't feel like a real-world problem, hence not backporting :).

@julianwiedmann
Copy link
Member Author

/test

@julianwiedmann julianwiedmann marked this pull request as ready for review May 14, 2025 06:20
@julianwiedmann julianwiedmann requested a review from a team as a code owner May 14, 2025 06:20
@julianwiedmann julianwiedmann enabled auto-merge May 14, 2025 06:28
Copy link
Member

@YutaroHayakawa YutaroHayakawa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scratched

@julianwiedmann julianwiedmann added this pull request to the merge queue May 14, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 14, 2025
@YutaroHayakawa YutaroHayakawa removed this pull request from the merge queue due to a manual request May 14, 2025
Copy link
Member

@YutaroHayakawa YutaroHayakawa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in general. Just to confirm, with this change, we skip some of the logics in the snat_v4/v6_nat_can_skip other than the port range check. The if (target->egress_gateway) part and !target->from_local_endpoint part. Are we sure we also skip them?

static __always_inline bool
snat_v4_nat_can_skip(const struct ipv4_nat_target *target,
		     const struct ipv4_ct_tuple *tuple)
{
	__u16 sport = bpf_ntohs(tuple->sport);

#if defined(ENABLE_EGRESS_GATEWAY_COMMON) && defined(IS_BPF_HOST)
	if (target->egress_gateway)
		return false;
#endif

	return (!target->from_local_endpoint && sport < NAT_MIN_EGRESS);
}

@maintainer-s-little-helper maintainer-s-little-helper bot removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 14, 2025
@julianwiedmann
Copy link
Member Author

julianwiedmann commented May 15, 2025

Just to confirm, with this change, we skip some of the logics in the snat_v4/v6_nat_can_skip other than the port range check. The if (target->egress_gateway) part and !target->from_local_endpoint part. Are we sure we also skip them?

That's correct. We care about conditions where snat_v*_nat_can_skip() would still return true. But if we agree that the sport < NAT_MIN_EGRESS condition is not valid for ICMP_ECHO, then the function essentially turns into what's sketched out below - and thus always returns false for ICMP_ECHO. So we don't even need to call it.

static __always_inline bool
snat_v4_nat_can_skip(const struct ipv4_nat_target *target,
		     const struct ipv4_ct_tuple *tuple)
{
	__u16 sport = bpf_ntohs(tuple->sport);

#if defined(ENABLE_EGRESS_GATEWAY_COMMON) && defined(IS_BPF_HOST)
	if (target->egress_gateway)
		return false;
#endif

        if (ICMP_ECHO)
	   return (!target->from_local_endpoint && false);
 
	return (!target->from_local_endpoint && sport < NAT_MIN_EGRESS);
}

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 15, 2025
@YutaroHayakawa YutaroHayakawa added this pull request to the merge queue May 15, 2025
Merged via the queue into cilium:main with commit f5d2906 May 15, 2025
74 checks passed
@julianwiedmann julianwiedmann deleted the 1.17-bpf-nat-icmp branch May 16, 2025 04:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. feature/snat Relates to SNAT or Masquerading of traffic kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact. severity/low Impact is limited, only affecting unusual configs, or all configs with low impact in prod.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants