Skip to content

Conversation

jschwinger233
Copy link
Member

@jschwinger233 jschwinger233 commented Jan 4, 2024

See commit messages for details.

Fixes proxy issues in egress direction

Once forward traffic for an egress proxy connection has traversed through
cilium_host / cilium_net, we expect IPsec-marked packets to get handled
by xfrm. But this currently conflicts with an iptables rule for the
proxy's transparent socket, which then over-writes the mark:

    -A CILIUM_PRE_mangle -m socket --transparent -m comment --comment "cilium: any->pod redirect proxied traffic to host proxy" -j MARK --set-xmark 0x200/0xffffffff

We can avoid this by adding an extra filter to this rule, so that it
doesn't match IPsec-marked packets.

Signed-off-by: Zhichuan Liang<gray.liang@isovalent.com>
After forward traffic for an egress proxy onnection has traversed through
cilium_host / cilium_net, we expect IPsec-marked packets to get handled
by xfrm.

This currently conflicts with early demux, which matches the connection's
transparent socket and assigns it to the packet:

```
// https://elixir.bootlin.com/linux/v6.2/source/net/ipv4/tcp_ipv4.c#L1770
int tcp_v4_early_demux(struct sk_buff *skb)
{
...
	sk = __inet_lookup_established(net, net->ipv4.tcp_death_row.hashinfo,
				       iph->saddr, th->source,
				       iph->daddr, ntohs(th->dest),
				       skb->skb_iif, inet_sdif(skb));
	if (sk) {
		skb->sk = sk;
...
}
```

It then gets dropped in ip_forward(), before reaching xfrm:

```
// https://elixir.bootlin.com/linux/v6.2/source/net/ipv4/ip_forward.c#L100
int ip_forward(struct sk_buff *skb)
{
...
    if (unlikely(skb->sk))
		goto drop;
...
}
```

To avoid this we disable early-demux in a L7 + IPsec config.

Note that the L7 proxy feature needs to deal with similar troubles, as the
comment for inboundProxyRedirectRule() describes. Ideally we would build
a similar solution for IPsec, diverting traffic with policy routing so that
it doesn't get intercepted by early-demux.

Signed-off-by: Zhichuan Liang<gray.liang@isovalent.com>
Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
The from-host path already knows how to handle traffic that comes from
the ingress proxy. Extend this logic to also cover traffic that originates
from the egress proxy.

Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jan 4, 2024
@jschwinger233
Copy link
Member Author

/test

@julianwiedmann julianwiedmann self-requested a review January 4, 2024 07:20
@jschwinger233 jschwinger233 marked this pull request as ready for review January 4, 2024 07:25
@jschwinger233 jschwinger233 requested a review from a team as a code owner January 4, 2024 07:25
@jschwinger233 jschwinger233 added area/proxy Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers. kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium. labels Jan 4, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jan 4, 2024
@jschwinger233 jschwinger233 added backport/author The backport will be carried out by the author of the PR. needs-backport/1.12 area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. labels Jan 4, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jan 4, 2024
@julianwiedmann julianwiedmann added this pull request to the merge queue Jan 4, 2024
Merged via the queue into cilium:main with commit e96e9cd Jan 4, 2024
@julianwiedmann julianwiedmann mentioned this pull request Mar 5, 2024
3 tasks
@julianwiedmann julianwiedmann added backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. and removed needs-backport/1.15 labels Mar 5, 2024
@julianwiedmann julianwiedmann mentioned this pull request Mar 5, 2024
6 tasks
@julianwiedmann julianwiedmann added backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. and removed needs-backport/1.14 labels Mar 5, 2024
@julianwiedmann julianwiedmann mentioned this pull request Mar 5, 2024
7 tasks
@github-actions github-actions bot added backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. and removed backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. backport-pending/1.13 labels Mar 6, 2024
@jrajahalme
Copy link
Member

@jschwinger233 I wonder if the socket mark condition here could be changed that it only applies to the forward path, and not to the return packet path? This change has caused issues with proxy return packets not getting to the proxy, likely due to the encryption mark being left on the packet after the return packets have been decrypted. This has caused issues like this: #27762 (comment). Currently fixing these by making sure the CT entry exists and has the ProxyRedirect flag set (e.g., #32614).

So in summary I see two possible ways to remedy this:

  1. make sure the encryption marks are cleared after decryption of the return packets (preferable, IMO)
  2. modify the conditions on the socket match rule so that they only apply to the forward path packets

@jschwinger233
Copy link
Member Author

@jrajahalme Sorry but I'm not sure how this PR causes #27762 which doesn't enable IPsec. Most changes this PR made is under the guard of #if defined(ENABLE_IPSEC), expect for magic = MARK_MAGIC_PROXY_EGRESS, which only makes a difference for local_delivery.

Do you think it's possible that #32367 has something to do with it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. area/proxy Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers. backport/author The backport will be carried out by the author of the PR. backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
No open projects
Status: Released
Status: Released
Status: Released
Development

Successfully merging this pull request may close these issues.

3 participants