Skip to content

datapath: Do not SNAT replies to outside #17168

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Sep 10, 2021
Merged

datapath: Do not SNAT replies to outside #17168

merged 3 commits into from
Sep 10, 2021

Conversation

brb
Copy link
Member

@brb brb commented Aug 16, 2021

See commit msgs.

Fix: #12544

@brb brb added the release-note/bug This PR fixes an issue in a previous release of Cilium. label Aug 16, 2021
@brb brb force-pushed the pr/brb/ct-snat-bpf branch 2 times, most recently from 49b6f6b to 6107b87 Compare August 16, 2021 12:03
@brb
Copy link
Member Author

brb commented Aug 16, 2021

test-net-next

@brb brb force-pushed the pr/brb/ct-snat-bpf branch 3 times, most recently from 76ad256 to b9a4d2d Compare August 16, 2021 13:42
@brb brb added needs-backport/1.10 area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. labels Aug 16, 2021
@brb brb changed the title datapath: Do not SNAT replies datapath: Do not SNAT replies to outside Aug 16, 2021
@brb
Copy link
Member Author

brb commented Aug 16, 2021

test-me-please

@brb
Copy link
Member Author

brb commented Aug 16, 2021

test-1.19-5.4

@brb
Copy link
Member Author

brb commented Aug 16, 2021

test-1.20-4.19

1 similar comment
@brb
Copy link
Member Author

brb commented Aug 16, 2021

test-1.20-4.19

@brb
Copy link
Member Author

brb commented Aug 17, 2021

4.19 is hitting the complexity issue 👀

@brb
Copy link
Member Author

brb commented Aug 18, 2021

test-1.20-4.19

@brb
Copy link
Member Author

brb commented Aug 18, 2021

test-me-please

@brb brb force-pushed the pr/brb/ct-snat-bpf branch from 12656d7 to c6764f1 Compare August 23, 2021 17:20
@brb
Copy link
Member Author

brb commented Aug 23, 2021

test-me-please

@brb brb marked this pull request as ready for review August 23, 2021 17:20
@brb brb requested a review from a team August 23, 2021 17:20
@brb brb requested a review from a team as a code owner August 23, 2021 17:20
@brb brb requested review from a team, kkourt and nebril August 23, 2021 17:20
@brb brb requested a review from borkmann August 23, 2021 17:26
@pchaigno pchaigno added the dont-merge/bad-bot To prevent MLH from marking ready-to-merge. label Sep 6, 2021
@brb brb force-pushed the pr/brb/ct-snat-bpf branch from c6764f1 to b24f05b Compare September 7, 2021 07:46
@brb
Copy link
Member Author

brb commented Sep 7, 2021

test-me-please

@brb brb force-pushed the pr/brb/ct-snat-bpf branch from b24f05b to d567075 Compare September 7, 2021 08:11
brb added 3 commits September 7, 2021 10:14
Previously, the BPF-based masquerading (--enable-bpf-masquerade=true)
was wrongly masquerading replies from a pod to an outside when the
outside had initiated a connection. This was possible when e.g., the
outside had a route to the pod cidr.

To fix this, we introduce a lightweight CT lookup function
ct_is_reply4() which checks whether a given flow is a reply. The lookup
function is called in snat_v4_needed().

As a side note, I've tried to move the port extraction to a separate
function, but unfortunately it hits complexity issues on the 4.19
kernel in the "K8sDatapathConfig AutoDirectNodeRoutes Check direct
connectivity with per endpoint routes" suite:

    BPF program is too large. Processed 131073 insn
    libbpf: failed to load program 'handle_to_container'
    libbpf: failed to load object '624_next/bpf_lxc.o'

Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Previously, they were failing due to our datapath masquerading replies
from pod to outside. As it got fixed in the previous commit, we can
enable BPF-based masquerading.

This will also gives us some coverage for the fix.

Signed-off-by: Martynas Pumputis <m@lambda.lt>
@brb
Copy link
Member Author

brb commented Sep 7, 2021

test-me-please

Job 'Cilium-PR-K8s-1.16-net-next' failed and has not been observed before, so may be related to your PR:

Click to show.

Test Name

K8sDemosTest Tests Star Wars Demo

Failure Output

FAIL: DNS entry is not ready after timeout

If it is a flake, comment /mlh new-flake Cilium-PR-K8s-1.16-net-next so I can create a new GitHub issue to track it.

@brb
Copy link
Member Author

brb commented Sep 8, 2021

test-net-next

@christarazi christarazi removed dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. dont-merge/bad-bot To prevent MLH from marking ready-to-merge. labels Sep 8, 2021
@christarazi
Copy link
Member

CI has passed and we have reviewer approval, marking ready to merge.

@christarazi christarazi added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Sep 8, 2021
@kaworu kaworu merged commit 4b92d2d into master Sep 10, 2021
@kaworu kaworu deleted the pr/brb/ct-snat-bpf branch September 10, 2021 15:17
brb added a commit that referenced this pull request Jul 11, 2024
Previously, we required socketLB to be enabled in order for BPF
masquerade to properly function. The reasoning was outlined in [1] and
[2].

As pointed by Julian Wiedmann, [3] resolved the following NAT reply
issue:

    On the remote node, the reply (dst=the client node IP) gets
    masqueraded by the BPF-masq feature, because we masquerade pod ->
    remote host IP in the tunnel mode (see comment in the
    "snat_v4_needed()" for the reason), and currently we don't consult
    the CT map to see whether a packet is reply.

Thus, we can remove the check.

[1]: #15437
[2]: 50e59c3
[3]: #17168

Signed-off-by: Martynas Pumputis <m@lambda.lt>
brb added a commit that referenced this pull request Jul 11, 2024
Previously, we required socketLB to be enabled in order for BPF
masquerade to properly function. The reasoning was outlined in [1] and
[2].

As pointed by Julian Wiedmann, [3] resolved the following NAT reply
issue:

    On the remote node, the reply (dst=the client node IP) gets
    masqueraded by the BPF-masq feature, because we masquerade pod ->
    remote host IP in the tunnel mode (see comment in the
    "snat_v4_needed()" for the reason), and currently we don't consult
    the CT map to see whether a packet is reply.

Thus, we can remove the check.

[1]: #15437
[2]: 50e59c3
[3]: #17168

Signed-off-by: Martynas Pumputis <m@lambda.lt>
github-merge-queue bot pushed a commit that referenced this pull request Jul 15, 2024
Previously, we required socketLB to be enabled in order for BPF
masquerade to properly function. The reasoning was outlined in [1] and
[2].

As pointed by Julian Wiedmann, [3] resolved the following NAT reply
issue:

    On the remote node, the reply (dst=the client node IP) gets
    masqueraded by the BPF-masq feature, because we masquerade pod ->
    remote host IP in the tunnel mode (see comment in the
    "snat_v4_needed()" for the reason), and currently we don't consult
    the CT map to see whether a packet is reply.

Thus, we can remove the check.

[1]: #15437
[2]: 50e59c3
[3]: #17168

Signed-off-by: Martynas Pumputis <m@lambda.lt>
sayboras pushed a commit that referenced this pull request Jul 16, 2024
[ upstream commit 3de8537 ]

Previously, we required socketLB to be enabled in order for BPF
masquerade to properly function. The reasoning was outlined in [1] and
[2].

As pointed by Julian Wiedmann, [3] resolved the following NAT reply
issue:

    On the remote node, the reply (dst=the client node IP) gets
    masqueraded by the BPF-masq feature, because we masquerade pod ->
    remote host IP in the tunnel mode (see comment in the
    "snat_v4_needed()" for the reason), and currently we don't consult
    the CT map to see whether a packet is reply.

Thus, we can remove the check.

[1]: #15437
[2]: 50e59c3
[3]: #17168

Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Tam Mach <tam.mach@cilium.io>
github-merge-queue bot pushed a commit that referenced this pull request Jul 16, 2024
[ upstream commit 3de8537 ]

Previously, we required socketLB to be enabled in order for BPF
masquerade to properly function. The reasoning was outlined in [1] and
[2].

As pointed by Julian Wiedmann, [3] resolved the following NAT reply
issue:

    On the remote node, the reply (dst=the client node IP) gets
    masqueraded by the BPF-masq feature, because we masquerade pod ->
    remote host IP in the tunnel mode (see comment in the
    "snat_v4_needed()" for the reason), and currently we don't consult
    the CT map to see whether a packet is reply.

Thus, we can remove the check.

[1]: #15437
[2]: 50e59c3
[3]: #17168

Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Tam Mach <tam.mach@cilium.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BPF-masq unexpectedly SNATs replies to outside
7 participants