Skip to content

Conversation

borkmann
Copy link
Member

@borkmann borkmann commented Apr 15, 2020

See commit msg.

Fixes: #10942
Fixes: f25d8b9 ("bpf: Preserve source identity for hairpin via stack")
Reported-by: Martynas Pumputis m@lambda.lt
Signed-off-by: Daniel Borkmann daniel@iogearbox.net

@borkmann borkmann added pending-review area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/misc This PR makes changes that have no direct user impact. labels Apr 15, 2020
@borkmann borkmann requested review from brb, jrfastab and a team April 15, 2020 23:47
@borkmann
Copy link
Member Author

test-me-please

@borkmann
Copy link
Member Author

test-focus K8sService.*

@borkmann borkmann added area/kube-proxy-free kind/bug This is a bug in the Cilium logic. labels Apr 15, 2020
Commit f25d8b9 ("bpf: Preserve source identity for hairpin via stack")
recently broke BPF NodePort since the use of MARK_MAGIC_SNAT_DONE is now in
conflict/overlapping with MARK_MAGIC_IDENTITY.

In tc egress hooks of bpf_{netdev,overlay} we have a check whether the SNAT
is not needed via if ((ctx->mark & MARK_MAGIC_SNAT_DONE) == MARK_MAGIC_SNAT_DONE).

MARK_MAGIC_SNAT_DONE is 0x0500 whereas MARK_MAGIC_IDENTITY is 0x0F00. So far
it was never a problem since MARK_MAGIC_IDENTITY was not used in a path where
MARK_MAGIC_SNAT_DONE was tested until this changed via f25d8b9 where now
SNAT was skipped for Pod traffic.

As a result, we've seen various flakes in SNAT or Hybrid NodePort setups mostly
on the tftp/UDP tests. Reverting f25d8b9 confirmed to fix these flakes. As
a workaround/fix, remap MARK_MAGIC_SNAT_DONE to 0x1500, so that setting the
MARK_MAGIC_IDENTITY will get a mismatch on above mentioned test and hence we'll
end up doing the SNAT.

This reaches into MARK_MAGIC_KEY_ID bit space, but the agent today cannot be
configured to run with both BPF NodePort and IPSec at the same time.

Fixes: #10942
Fixes: f25d8b9 ("bpf: Preserve source identity for hairpin via stack")
Reported-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
@borkmann borkmann force-pushed the pr/fix-snat-marker branch from f83714d to f89a91a Compare April 16, 2020 00:03
@borkmann
Copy link
Member Author

test-me-please

@borkmann
Copy link
Member Author

test-focus K8sService.*

@coveralls
Copy link

coveralls commented Apr 16, 2020

Coverage Status

Coverage increased (+0.02%) to 46.813% when pulling f89a91a on pr/fix-snat-marker into 4ce5a4e on master.

@borkmann
Copy link
Member Author

test-gke

Copy link
Member

@brb brb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! We should revisit the mark once we start supporting NodePort BPF + transparent encryption.

@borkmann borkmann merged commit f2bcb69 into master Apr 16, 2020
@borkmann borkmann deleted the pr/fix-snat-marker branch April 16, 2020 09:02
@borkmann
Copy link
Member Author

(Needs to be backported where also #10926 gets backported.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. kind/bug This is a bug in the Cilium logic. release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CI: Suite-k8s-1.11.K8sServicesTest Checks service across nodes Tests NodePort BPF Tests with XDP, direct routing and Hybrid: Exit code 28
7 participants