Cilium Feature Proposal
This "CFP" is about the interaction between network policies and clustermesh, and already provides two possible paths to improve the current situation. See this as taking a step back on how policies and clustermesh interact: collecting the pros and cons of the current behavior and trying to determine whether we should keep it and improve it, or change it.
Problem/context statement
In a clustermesh environment, policies by default allow traffic to endpoints matching the same labels in all clusters, unless the cluster name is explicitly specified (see https://docs.cilium.io/en/latest/network/clustermesh/policy/#allowing-specific-communication-between-clusters). This has various usability and/or security complications, such as:
- a policy made for an "isolated" cluster can't be directly applied/transposed to a clustermesh cluster
- it's easy to overreach and allow traffic to all clusters instead of only the local cluster
- to actually restrict traffic to the local cluster, a policy must be aware of the local cluster name
This could affect any policy. I think the most notable scenarios are CCNPs, regular NetworkPolicies that are "packaged" with third-party deployments (for instance a Helm chart) that a user may install, and probably host policies (?).
On a more positive note, this behavior of allowing clustermesh traffic by default does lower the barrier between clusters and could, in a way, make it easier to adopt clustermesh.
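To make the last point of the list above concrete, here is a minimal sketch of what restricting a policy to the local cluster looks like today, following the pattern from the linked documentation. The cluster name cluster1 and the policy name are assumptions for illustration only.

```yaml
# Without a cluster label, the toEndpoints selector would match rebel-base
# endpoints in every cluster of the mesh. To stay local today, the policy
# has to hardcode the local cluster name ("cluster1" is an assumed name).
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "allow-local-cluster-hardcoded"
spec:
  description: "Allow x-wing to contact rebel-base in cluster1 only"
  endpointSelector:
    matchLabels:
      name: x-wing
  egress:
  - toEndpoints:
    - matchLabels:
        name: rebel-base
        io.cilium.k8s.policy.cluster: cluster1
```

The same manifest can't be reused as-is in a cluster with a different name, which is what makes packaged policies awkward.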
Possible Solutions
Keep the current behavior
One solution could be to keep the current behavior. In that case, we should most likely also make it extra clear in the docs that traffic is allowed to all clusters by default, and do so in several places so that as many users as possible notice and understand this.
It would also be nice to give a policy a way to reference its local cluster without knowing its name. For instance, one way to do that could be a special value such as @self or @local that would be translated to the local cluster name. Something like this:
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "allow-local-cluster"
spec:
  description: "Allow x-wing to contact rebel-base in local cluster"
  endpointSelector:
    matchLabels:
      name: x-wing
  egress:
  - toEndpoints:
    - matchLabels:
        name: rebel-base
        io.cilium.k8s.policy.cluster: "@self"
There might be a better way to express this, since a special value like this would be a new pattern in the Cilium policy system.
Change the default to assume the local cluster
Much like the namespace is automatically assumed/selected for a CNP or a regular NetworkPolicy, we could make the local cluster selected by default unless the user overrides this and explicitly selects a cluster. Hopefully we could use a similar implementation to avoid the possible pitfalls.
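As a rough sketch of that analogy (not an actual implementation), the two fragments below show a selector as the user would write it and how it could be interpreted under this proposal. The namespace default and the cluster name cluster1 are assumed values, and the implicit cluster label is the proposed behavior, not something that exists today.

```yaml
# What the user writes in a CNP living in namespace "default" (assumed):
egress:
- toEndpoints:
  - matchLabels:
      name: rebel-base
---
# How the selector would effectively be interpreted:
egress:
- toEndpoints:
  - matchLabels:
      name: rebel-base
      # already implied today for a namespaced CNP
      io.kubernetes.pod.namespace: default
      # would be implied under this proposal; "cluster1" stands for
      # the local cluster name (assumed)
      io.cilium.k8s.policy.cluster: cluster1
```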
Selecting all clusters would be similar to the current way of selecting all namespaces: a match expression checking that the label io.cilium.k8s.policy.cluster exists, like so:
matchExpressions:
- key: io.cilium.k8s.policy.cluster
  operator: Exists
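For context, a complete policy that explicitly opts back in to all clusters could look roughly like the sketch below. It reuses the x-wing/rebel-base example from earlier in this proposal; the policy name is hypothetical.

```yaml
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "allow-all-clusters"  # hypothetical name
spec:
  description: "Allow x-wing to contact rebel-base in any cluster"
  endpointSelector:
    matchLabels:
      name: x-wing
  egress:
  - toEndpoints:
    - matchLabels:
        name: rebel-base
      # explicit opt-in: match endpoints whatever their cluster label is
      matchExpressions:
      - key: io.cilium.k8s.policy.cluster
        operator: Exists
```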
Selecting all clusters this way doesn't seem to be possible for regular NetworkPolicies from "networking.k8s.io", so we should clarify what the behavior would be in that case. For instance, we could restrict them to the local cluster by default and suggest using a CNP otherwise.
If we did something like this, the breaking change could be handled with a new flag to enable/disable the new behavior. For instance, the new behavior could be enabled by default, and we would keep the option to disable it for at least 1 or 2 Cilium versions.