-
Notifications
You must be signed in to change notification settings - Fork 3.4k
datapath: Do not SNAT replies to outside #17168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
49b6f6b
to
6107b87
Compare
test-net-next |
76ad256
to
b9a4d2d
Compare
test-me-please |
test-1.19-5.4 |
test-1.20-4.19 |
1 similar comment
test-1.20-4.19 |
4.19 is hitting the complexity issue 👀 |
test-1.20-4.19 |
test-me-please |
12656d7
to
c6764f1
Compare
test-me-please |
c6764f1
to
b24f05b
Compare
test-me-please |
b24f05b
to
d567075
Compare
Previously, the BPF-based masquerading (--enable-bpf-masquerade=true) was wrongly masquerading replies from a pod to an outside when the outside had initiated a connection. This was possible when e.g., the outside had a route to the pod cidr. To fix this, we introduce a lightweight CT lookup function ct_is_reply4() which checks whether a given flow is a reply. The lookup function is called in snat_v4_needed(). As a side note, I've tried to move the port extraction to a separate function, but unfortunately it hits complexity issues on the 4.19 kernel in the "K8sDatapathConfig AutoDirectNodeRoutes Check direct connectivity with per endpoint routes" suite: BPF program is too large. Processed 131073 insn libbpf: failed to load program 'handle_to_container' libbpf: failed to load object '624_next/bpf_lxc.o' Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Previously, they were failing due to our datapath masquerading replies from pod to outside. As it got fixed in the previous commit, we can enable BPF-based masquerading. This will also gives us some coverage for the fix. Signed-off-by: Martynas Pumputis <m@lambda.lt>
test-me-please Job 'Cilium-PR-K8s-1.16-net-next' failed and has not been observed before, so may be related to your PR: Click to show.Test Name
Failure Output
If it is a flake, comment |
test-net-next |
CI has passed and we have reviewer approval, marking ready to merge. |
Previously, we required socketLB to be enabled in order for BPF masquerade to properly function. The reasoning was outlined in [1] and [2]. As pointed by Julian Wiedmann, [3] resolved the following NAT reply issue: On the remote node, the reply (dst=the client node IP) gets masqueraded by the BPF-masq feature, because we masquerade pod -> remote host IP in the tunnel mode (see comment in the "snat_v4_needed()" for the reason), and currently we don't consult the CT map to see whether a packet is reply. Thus, we can remove the check. [1]: #15437 [2]: 50e59c3 [3]: #17168 Signed-off-by: Martynas Pumputis <m@lambda.lt>
Previously, we required socketLB to be enabled in order for BPF masquerade to properly function. The reasoning was outlined in [1] and [2]. As pointed by Julian Wiedmann, [3] resolved the following NAT reply issue: On the remote node, the reply (dst=the client node IP) gets masqueraded by the BPF-masq feature, because we masquerade pod -> remote host IP in the tunnel mode (see comment in the "snat_v4_needed()" for the reason), and currently we don't consult the CT map to see whether a packet is reply. Thus, we can remove the check. [1]: #15437 [2]: 50e59c3 [3]: #17168 Signed-off-by: Martynas Pumputis <m@lambda.lt>
Previously, we required socketLB to be enabled in order for BPF masquerade to properly function. The reasoning was outlined in [1] and [2]. As pointed by Julian Wiedmann, [3] resolved the following NAT reply issue: On the remote node, the reply (dst=the client node IP) gets masqueraded by the BPF-masq feature, because we masquerade pod -> remote host IP in the tunnel mode (see comment in the "snat_v4_needed()" for the reason), and currently we don't consult the CT map to see whether a packet is reply. Thus, we can remove the check. [1]: #15437 [2]: 50e59c3 [3]: #17168 Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 3de8537 ] Previously, we required socketLB to be enabled in order for BPF masquerade to properly function. The reasoning was outlined in [1] and [2]. As pointed by Julian Wiedmann, [3] resolved the following NAT reply issue: On the remote node, the reply (dst=the client node IP) gets masqueraded by the BPF-masq feature, because we masquerade pod -> remote host IP in the tunnel mode (see comment in the "snat_v4_needed()" for the reason), and currently we don't consult the CT map to see whether a packet is reply. Thus, we can remove the check. [1]: #15437 [2]: 50e59c3 [3]: #17168 Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Tam Mach <tam.mach@cilium.io>
[ upstream commit 3de8537 ] Previously, we required socketLB to be enabled in order for BPF masquerade to properly function. The reasoning was outlined in [1] and [2]. As pointed by Julian Wiedmann, [3] resolved the following NAT reply issue: On the remote node, the reply (dst=the client node IP) gets masqueraded by the BPF-masq feature, because we masquerade pod -> remote host IP in the tunnel mode (see comment in the "snat_v4_needed()" for the reason), and currently we don't consult the CT map to see whether a packet is reply. Thus, we can remove the check. [1]: #15437 [2]: 50e59c3 [3]: #17168 Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Tam Mach <tam.mach@cilium.io>
See commit msgs.
Fix: #12544