bpf: tunnel-related cleanups in to-container path #28920
Merged
julianwiedmann merged 5 commits into cilium:main from julianwiedmann:1.15-bpf-local-delivery-overlay on Nov 7, 2023
Conversation
/test
julianwiedmann force-pushed from 9cb148c to 799a3af
/test
…l-call

We recently added a `from_tunnel` parameter to the local delivery path of IPv4 traffic, which gets passed to the pod's ingress tail-call via CB_FROM_TUNNEL. Add the same parameter for the IPv6 path, and then use it in bpf_lxc to fine-tune the kube-proxy workaround.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
To stay consistent with bpf_lxc, prefer the `from_tunnel` parameter over IS_BPF_OVERLAY to detect whether l3_local_delivery() was inlined by from-overlay.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
This specific workaround claims to handle service replies that came in via tunnel, and require RevDNAT by kube-proxy before delivery to the local pod. But it's located in the to-container program (and so the next hop will be the veth peer, *not* the kernel stack). And l3_local_delivery() in from-overlay will have already applied the same workaround, before passing the packet to the stack. Thus it's safe to remove the stale code in bpf_lxc.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
julianwiedmann force-pushed from 799a3af to 22ae0a4
/test
Wrap the access to CB_FROM_TUNNEL in HAVE_ENCAP, so that it can be easily optimized out when there's no tunnel traffic in the cluster.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Instead of inflicting this code path on every user of redirect_ep(), limit it to actual tunnel traffic. This is possible as we now have the `from_tunnel` parameter in bpf_lxc's policy tail-call function.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
julianwiedmann force-pushed from 22ae0a4 to 44ca7c7
/test
pchaigno approved these changes on Nov 6, 2023
ldelossa approved these changes on Nov 6, 2023
julianwiedmann added a commit to julianwiedmann/cilium that referenced this pull request on Nov 21, 2023
cilium#28920 introduced usage of CB_FROM_TUNNEL in the IPv6 ingress tail-call. But unless all calling programs get upgraded/downgraded in *very* specific order (so that while this version of the tail-call is installed, *all* in-flight skbs have their CB_FROM_TUNNEL populated), some packets potentially miss the tunnel-specific processing that tail_ipv6_policy() is meant to apply. In particular their type might not get switched to PACKET_HOST, resulting in drops with SKB_DROP_REASON_OTHERHOST.

So we need to take a two-step approach - first roll out the support for CB_FROM_TUNNEL into all callers, but wait until a subsequent release (1.16) before tail_ipv6_policy() starts using it. Thus even on a downgrade to 1.15, all callers will continue to populate CB_FROM_TUNNEL.

Reported-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
brb pushed a commit that referenced this pull request on Nov 21, 2023
brb pushed a commit that referenced this pull request on Nov 22, 2023
pjablonski123 pushed a commit to pjablonski123/cilium that referenced this pull request on Dec 15, 2023
Labels
- area/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
- area/kube-proxy: Issues related to kube-proxy (not the kube-proxy-free mode).
- ready-to-merge: This PR has passed all tests and received consensus from code owners to merge.
- release-note/misc: This PR makes changes that have no direct user impact.
We've accumulated a bunch of special handling and kube-proxy workarounds related to tunnel traffic. With the recent introduction of a `from_tunnel` parameter for bpf_lxc's ingress tail-call, it's now easier to fine-tune these code paths and reduce their scope.