Skip to content

Conversation

borkmann
Copy link
Member

See commit msg

…tion

Add an agent knob for expert users who wish to run backend Pods on the
same ports as the NodePort service is exposed. We recently resolved this
by making cgroup BPF hooks netns aware in the kernel and Cilium via work
from d7bf451 ("bpf: fix init namespace handling in sock lb programs").
This works for 5.7 kernels.

However, if there is a need for older kernels to deploy Pods in this way
(or to opt out of the bind hooks entirely for different reasons), then one
can control this setting through --node-port-bind-protection. The knob
defaults to true (same as we do today), so that expert users can opt-out
from it if they need to.

With bind protection enabled:

  # ./daemon/cilium-agent  [...]
  # bpftool cgroup tree
  CgroupPath
  ID       AttachType      AttachFlags     Name
  /sys/fs/cgroup/unified
  48126    connect4
  48122    connect6
  48127    post_bind4   <---+ bind prevention enabled by default
  48123    post_bind6   <---`
  48128    sendmsg4
  48124    sendmsg6
  48129    recvmsg4
  48125    recvmsg6
  [...]

With bind protection disabled:

  # ./daemon/cilium-agent  [...] --node-port-bind-protection=false
  # bpftool cgroup tree
  CgroupPath
  ID       AttachType      AttachFlags     Name
  /sys/fs/cgroup/unified
  49607    connect4
  49604    connect6
  49608    sendmsg4
  49605    sendmsg6
  49609    recvmsg4
  49606    recvmsg6
  [...]

Reported-by: Rene Zbinden <rene.zbinden@postfinance.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
@borkmann borkmann added pending-review area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. labels Apr 28, 2020
@borkmann borkmann requested review from brb and a team April 28, 2020 12:21
@borkmann borkmann requested a review from a team as a code owner April 28, 2020 12:21
@borkmann borkmann requested a review from a team April 28, 2020 12:21
@maintainer-s-little-helper
Copy link

Please set the appropriate release note label.

3 similar comments
@maintainer-s-little-helper
Copy link

Please set the appropriate release note label.

@maintainer-s-little-helper
Copy link

Please set the appropriate release note label.

@maintainer-s-little-helper
Copy link

Please set the appropriate release note label.

@borkmann
Copy link
Member Author

test-docs-please

@borkmann
Copy link
Member Author

test-me-please

@borkmann borkmann requested a review from gandro April 28, 2020 12:34
@borkmann borkmann force-pushed the pr/bind-protection branch from e8c6936 to ecd8050 Compare April 28, 2020 12:44
@borkmann
Copy link
Member Author

test-me-please

@coveralls
Copy link

coveralls commented Apr 28, 2020

Coverage Status

Coverage decreased (-0.02%) to 44.632% when pulling 3da5425 on pr/bind-protection into c0ef36f on master.

@borkmann
Copy link
Member Author

restart-gke

@borkmann
Copy link
Member Author

Hit #10118 on GKE

@borkmann
Copy link
Member Author

test-gke

@borkmann
Copy link
Member Author

test-gke

@borkmann
Copy link
Member Author

From GKE CI:

20:53:38  Step 1/14 : FROM docker.io/library/golang:1.14.2 as builder
20:53:38   ---> 2421885b04da
20:53:38  Step 2/14 : LABEL maintainer="maintainer@cilium.io"
20:53:38   ---> Using cache
20:53:38   ---> 8e0caf011e36
20:53:38  Step 3/14 : ADD . /go/src/github.com/cilium/cilium
20:54:38  failed to export image: failed to set parent sha256:8e0caf011e36f6dd297230873e7f61a2d1f6a560db4e1584f905339950408ffb: unknown parent image ID sha256:8e0caf011e36f6dd297230873e7f61a2d1f6a560db4e1584f905339950408ffb
20:54:38  Makefile:253: recipe for target 'docker-operator-image' failed
20:54:38  make: *** [docker-operator-image] Error 1

@borkmann
Copy link
Member Author

test-gke

@borkmann
Copy link
Member Author

Hit #10231 on GKE

@borkmann
Copy link
Member Author

test-gke

@pchaigno
Copy link
Member

Add an agent knob for expert users who wish to run backend Pods on the same ports as the NodePort service is exposed.

I'm probably missing some context, but is this still needed once #11210 is merged? Or is this still needed for pods in the host network?

@borkmann
Copy link
Member Author

Add an agent knob for expert users who wish to run backend Pods on the same ports as the NodePort service is exposed.

I'm probably missing some context, but is this still needed once #11210 is merged? Or is this still needed for pods in the host network?

Yes , if users really want to opt-out on older kernels where this feature is not available.

@borkmann
Copy link
Member Author

borkmann commented Apr 29, 2020

(All tests green, GKE one is randomly flaking & failing in unrelated areas. Only updating the doc nit from Tobias right now. GKE tests are running 4.14.138 kernel which does not run BPF nodeport anyway.)

Add related documentation and options for helm to aid configuration
for the bindProtection setting.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
@borkmann borkmann force-pushed the pr/bind-protection branch from ecd8050 to 3da5425 Compare April 29, 2020 08:00
@borkmann
Copy link
Member Author

test-docs-please

Copy link
Member

@brb brb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@borkmann borkmann merged commit d474e71 into master Apr 29, 2020
@borkmann borkmann deleted the pr/bind-protection branch April 29, 2020 08:51
Copy link
Member

@aanm aanm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this requires backport?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants