-
Notifications
You must be signed in to change notification settings - Fork 3.4k
datapath: Create sysctl rp_filter
overwrite config on agent init
#20072
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
SystemD versions greater than 245 will create sysctl config which sets the `rp_filter` value for all network interfaces to 1. This conflicts with cilium which requires `rp_filter` to be 0 on interfaces it uses. This commit adds a small utility/tool: `sysctlfix` which will insert a config file into the `/etc/sysctl.d` dir with the highest priority containing directives to disable `rp_filter` and perhaps to contain other sysctl config in future. This utility is called as an init container before the cilium agent starts. Because the sysctl config is in place before the agent starts, all interfaces created by the agent and matching the patten in the config file will have `rp_filter` disabled, even when SystemD >=245 is installed. Fixes: cilium#10645 Fixes: cilium#19909 Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 💯
Below is from my local minikube, do we need to do something with
|
yeah, in our case from #19761, the proposed change is not enough to resolve the issue and |
@sayboras @rastislavs is the goal of adding
I have to add
|
@dylandreimerink -- I was reading this and was curious if this can be done in the CNI plugin itself (something like https://github.com/cilium/cilium/blob/master/plugins/cilium-cni/cilium-cni.go#L516-L527). I am not familiar with the problem itself, but wanted to offer a possible alternative way to achieving the goal (edit the sysctl and reload inside the netns). |
@ungureanuvladvictor that is something to look into. I just occurred to me that this might also affect Cilium in 'standalone LB' mode (@brb, @borkmann ?), setting this up via an init container might not be the way to go. Perhaps we should move this into the agent/daemon itself. |
CNI plugin is an interesting option, the awkward thing there is that the CNI binary is run semi-frequently and each time is an independent execution of the binary. So it'd just need to be idempotent. The nice thing about running in CNI plugin context is that it's running as an application directly on the host with privileges, so we shouldn't need to worry about the security context at all. Thinking about the option to move it into the agent, this becomes more tricky because then we're further expanding the set of things that the cilium-agent container can access at runtime, and we're trying to avoid granting blanket access to lots of the system at all times from the cilium-agent. Cilium needs to be pretty powerful already, but do we really need access to reconfigure random |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
tools/sysctlfix/main.go
Outdated
|
||
var sysctlConfig = strings.Join([]string{ | ||
"# Disable rp_filter on Cilium interfaces since it may cause mangled packets to be dropped", | ||
"net.ipv4.conf.lxc*.rp_filter = 0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I overlooked that we are completely disabling the filter. What about setting it to the Loose
mode? That's all we should need to pass the packets with asymmetric routing. Details - #13130. That's the value we are currently setting as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I set it to disabled since that seemed to be the suggested fix in #10645. According too @aanm, this commit in SystemD is the source of the issue. In the commit the rp_filter
is changed from 1(Strict) to 2(Loose). In #10645 this commit is mentioned as the culprit (also a change to 2(Loose))
The description of 'Loose' mode is:
Loose mode as defined in RFC3704 Loose Reverse Path
Each incoming packet's source address is also tested against the FIB
and if the source address is not reachable via any interface
the packet check will fail.
I am not sure if there are common use cases in Cilium where we handle packets for source addresses which don't appear in the FIB, but it wouldn't surprise me if there are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cilium isn't the only entity that could be running on a host. The issues you've linked above indicate that systemd overwrites rp_filter
settings set by Cilium. Regardless, I would like for us to understand why we would need to disable source address checks altogether in case of Cilium.
That is a good point, hadn't considered that. Running the tool should be a one time thing, so we can make it part of the installation process/guide for the standalone LB if it ever becomes an issue.
Updating the config file frequently doesn't do any harm. But we should only trigger a reload of the systemd unit when it has actually changed, will take this into account when updating the tool, so we can move this logic to the CNI plugin if we wish. |
@dylandreimerink -- another thing that crossed my mind right now is that I don't think this approach is going to work when deploying Cilium into OpenShift. Over there there files on disk (esp config files for various systems [sshd, systemd, etc]) are configured via this and when updates are done outside this mechanism the node itself will get into a degraded state. Maybe this is not worth changing the approach for now but would be good to give it a try install into OpenShift and then run an OpenShift upgrade to see if the system will be happy. |
The newly added init container (e.g. sysctl init) requires hostproc volume mount, however, this volume is only mounted based on the helm flag .Values.cgroup.autoMount.enabled. This commit is to make sure that such condition is added to avoid any failure. Relates: cilium#20072 Fixes: cilium#20626 Signed-off-by: Tam Mach <tam.mach@cilium.io>
The newly added init container (e.g. sysctl init) requires hostproc volume mount, however, this volume is only mounted based on the helm flag .Values.cgroup.autoMount.enabled. This commit is to make sure that such condition is added to avoid any failure. Relates: #20072 Fixes: #20626 Signed-off-by: Tam Mach <tam.mach@cilium.io>
[ upstream commit 6d920b2 ] [ backporter's notes: original file `install/kubernetes/cilium/templates/cilium-agent/daemonset.yaml` does not exist in v1.10 due to the Helm refactor from 8c49b83, changes were applied to `install/kubernetes/cilium/templates/cilium-agent-daemonset.yaml` instead. ] The newly added init container (e.g. sysctl init) requires hostproc volume mount, however, this volume is only mounted based on the helm flag .Values.cgroup.autoMount.enabled. This commit is to make sure that such condition is added to avoid any failure. Relates: cilium#20072 Fixes: cilium#20626 Signed-off-by: Tam Mach <tam.mach@cilium.io> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>
The newly added init container (e.g. sysctl init) requires hostproc volume mount, however, this volume is only mounted based on the helm flag .Values.cgroup.autoMount.enabled. This commit is to make sure that such condition is added to avoid any failure. Relates: cilium#20072 Fixes: cilium#20626 Signed-off-by: Tam Mach <tam.mach@cilium.io>
[ upstream commit 6d920b2 ] [ backporter's notes: conflicts due to privileges rework not having been backported to `v1.11`. ] The newly added init container (e.g. sysctl init) requires hostproc volume mount, however, this volume is only mounted based on the helm flag .Values.cgroup.autoMount.enabled. This commit is to make sure that such condition is added to avoid any failure. Relates: cilium#20072 Fixes: cilium#20626 Signed-off-by: Tam Mach <tam.mach@cilium.io> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>
[ upstream commit 6d920b2 ] The newly added init container (e.g. sysctl init) requires hostproc volume mount, however, this volume is only mounted based on the helm flag .Values.cgroup.autoMount.enabled. This commit is to make sure that such condition is added to avoid any failure. Relates: cilium#20072 Fixes: cilium#20626 Signed-off-by: Tam Mach <tam.mach@cilium.io> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>
[ upstream commit 6d920b2 ] [ backporter's notes: conflicts due to privileges rework not having been backported to `v1.11`. ] The newly added init container (e.g. sysctl init) requires hostproc volume mount, however, this volume is only mounted based on the helm flag .Values.cgroup.autoMount.enabled. This commit is to make sure that such condition is added to avoid any failure. Relates: #20072 Fixes: #20626 Signed-off-by: Tam Mach <tam.mach@cilium.io> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>
[ upstream commit 6d920b2 ] [ backporter's notes: original file `install/kubernetes/cilium/templates/cilium-agent/daemonset.yaml` does not exist in v1.10 due to the Helm refactor from 8c49b83, changes were applied to `install/kubernetes/cilium/templates/cilium-agent-daemonset.yaml` instead. ] The newly added init container (e.g. sysctl init) requires hostproc volume mount, however, this volume is only mounted based on the helm flag .Values.cgroup.autoMount.enabled. This commit is to make sure that such condition is added to avoid any failure. Relates: #20072 Fixes: #20626 Signed-off-by: Tam Mach <tam.mach@cilium.io> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>
[ upstream commit 6d920b2 ] The newly added init container (e.g. sysctl init) requires hostproc volume mount, however, this volume is only mounted based on the helm flag .Values.cgroup.autoMount.enabled. This commit is to make sure that such condition is added to avoid any failure. Relates: #20072 Fixes: #20626 Signed-off-by: Tam Mach <tam.mach@cilium.io> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>
SystemD versions greater than 245 will create sysctl config which sets
the
rp_filter
value for all network interfaces to 1. This conflictswith cilium which requires
rp_filter
to be 0 on interfaces it uses.This commit adds a small utility/tool:
sysctlfix
which will inserta config file into the
/etc/sysctl.d
dir with the highest prioritycontaining directives to disable
rp_filter
and perhaps to containother sysctl config in future.
This utility is called as an init container before the cilium agent
starts. Because the sysctl config is in place before the agent starts,
all interfaces created by the agent and matching the patten in the
config file will have
rp_filter
disabled, even when SystemD >=245 isinstalled.
Fixes: #10645
Fixes: #19909