Description
Is there an existing issue for this?
- I have searched the existing issues
Version
equal or higher than v1.16.4 and lower than v1.17.0
What happened?
Hi! I have stumbled upon an issue where, if you have both fromEntities and fromCIDRSet specified in a single CiliumClusterwideNetworkPolicy (at least in the ingress spec), the latter overwrites the former, and entities are not included in the final aggregated policy, which in turn causes issues with various non-critical Kubernetes tools.
Example CIDR group and policy (IPs redacted):
```yaml
---
apiVersion: cilium.io/v2alpha1
kind: CiliumCIDRGroup
metadata:
  name: test-ccg
spec:
  externalCIDRs:
  - "1.2.3.4/32"
  - "5.6.7.8/32"
---
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: talos
spec:
  description: Allow access to Talos Linux apid and trustd
  nodeSelector:
    matchLabels: {}
  ingress:
  - fromEntities:
    - cluster
    toPorts:
    - ports:
      - port: "50000"
        protocol: TCP
      - port: "50001"
        protocol: TCP
  - fromCIDRSet:
    - cidrGroupRef: test-ccg
    toPorts:
    - ports:
      - port: "50000"
        protocol: TCP
      - port: "50001"
        protocol: TCP
```
After applying, our Talos backup CronJob fails with dial tcp 10.99.85.21:50000: i/o timeout, which is the address of the talos service in the default namespace.
Output of cilium monitor:
```
Policy verdict log: flow 0x0 local EP ID 1524, remote ID 82933, proto 6, ingress, action deny, auth: disabled, match none, 10.244.5.44:52330 -> 10.10.13.201:50000 tcp SYN
Policy verdict log: flow 0x0 local EP ID 1524, remote ID 82933, proto 6, ingress, action deny, auth: disabled, match none, 10.244.5.44:52330 -> 10.10.13.201:50000 tcp SYN
Policy verdict log: flow 0x0 local EP ID 1524, remote ID 82933, proto 6, ingress, action deny, auth: disabled, match none, 10.244.5.44:52330 -> 10.10.13.201:50000 tcp SYN
Policy verdict log: flow 0x0 local EP ID 1524, remote ID 82933, proto 6, ingress, action deny, auth: disabled, match none, 10.244.5.44:52330 -> 10.10.13.201:50000 tcp SYN
```
A similar policy is applied for port 6443, and that results in many components, including metrics-server, various operators, Kyverno, and even hubble-generate-certs failing to connect to kube-apiserver.
Observing the output of cilium-dbg endpoint get <HOST_EP_ID> -o yaml (the default output results in #29247), I can see that the aggregated policy does not include the reserved entities from the cluster entity group - here's the status.policy.realized.l4 block:
```yaml
- derivedfromrules:
  - - k8s:io.cilium.k8s.policy.derived-from=CiliumClusterwideNetworkPolicy
    - k8s:io.cilium.k8s.policy.name=talos
    - k8s:io.cilium.k8s.policy.uid=09fd8860-e7f2-48b8-94c1-84ebde4d9fb0
  rule: '{"port":50000,"protocol":"TCP","l7-rules":[{"\u0026LabelSelector{MatchLabels:map[string]string{cidr.1.2.3.4/32: ,},MatchExpressions:[]LabelSelectorRequirement{},}":null},{"\u0026LabelSelector{MatchLabels:map[string]string{cidr.5.6.7.8/32: ,},MatchExpressions:[]LabelSelectorRequirement{},}":null},{"\u0026LabelSelector{MatchLabels:map[string]string{},MatchExpressions:[]LabelSelectorRequirement{},}":null}]}'
  rulesbyselector:
    '&LabelSelector{MatchLabels:map[string]string{},MatchExpressions:[]LabelSelectorRequirement{},}':
    - - k8s:io.cilium.k8s.policy.derived-from=CiliumClusterwideNetworkPolicy
      - k8s:io.cilium.k8s.policy.name=talos
      - k8s:io.cilium.k8s.policy.uid=09fd8860-e7f2-48b8-94c1-84ebde4d9fb0
    '&LabelSelector{MatchLabels:map[string]string{cidr.1.2.3.4/32: ,},MatchExpressions:[]LabelSelectorRequirement{},}':
    - - k8s:io.cilium.k8s.policy.derived-from=CiliumClusterwideNetworkPolicy
      - k8s:io.cilium.k8s.policy.name=talos
      - k8s:io.cilium.k8s.policy.uid=09fd8860-e7f2-48b8-94c1-84ebde4d9fb0
    '&LabelSelector{MatchLabels:map[string]string{cidr.5.6.7.8/32: ,},MatchExpressions:[]LabelSelectorRequirement{},}':
    - - k8s:io.cilium.k8s.policy.derived-from=CiliumClusterwideNetworkPolicy
      - k8s:io.cilium.k8s.policy.name=talos
      - k8s:io.cilium.k8s.policy.uid=09fd8860-e7f2-48b8-94c1-84ebde4d9fb0
```
If I remove fromCIDRSet from the policy and replace it with a more traditional fromCIDR listing all the necessary IPs directly in the policy, everything works correctly - connections are let through and the entities appear in the aggregated policy (a sketch of the modified rule follows after the output below):
```yaml
? '&LabelSelector{MatchLabels:map[string]string{k8s.io.cilium.k8s.policy.cluster: test,},MatchExpressions:[]LabelSelectorRequirement{},}'
: - - k8s:io.cilium.k8s.policy.derived-from=CiliumClusterwideNetworkPolicy
    - k8s:io.cilium.k8s.policy.name=controlplane
    - k8s:io.cilium.k8s.policy.uid=48b548eb-2a12-45a7-95fa-6cfdd452c871
'&LabelSelector{MatchLabels:map[string]string{reserved.health: ,},MatchExpressions:[]LabelSelectorRequirement{},}':
- - k8s:io.cilium.k8s.policy.derived-from=CiliumClusterwideNetworkPolicy
  - k8s:io.cilium.k8s.policy.name=controlplane
  - k8s:io.cilium.k8s.policy.uid=48b548eb-2a12-45a7-95fa-6cfdd452c871
'&LabelSelector{MatchLabels:map[string]string{reserved.host: ,},MatchExpressions:[]LabelSelectorRequirement{},}':
- - k8s:io.cilium.k8s.policy.derived-from=CiliumClusterwideNetworkPolicy
  - k8s:io.cilium.k8s.policy.name=controlplane
  - k8s:io.cilium.k8s.policy.uid=48b548eb-2a12-45a7-95fa-6cfdd452c871
'&LabelSelector{MatchLabels:map[string]string{reserved.ingress: ,},MatchExpressions:[]LabelSelectorRequirement{},}':
- - k8s:io.cilium.k8s.policy.derived-from=CiliumClusterwideNetworkPolicy
  - k8s:io.cilium.k8s.policy.name=controlplane
  - k8s:io.cilium.k8s.policy.uid=48b548eb-2a12-45a7-95fa-6cfdd452c871
'&LabelSelector{MatchLabels:map[string]string{reserved.init: ,},MatchExpressions:[]LabelSelectorRequirement{},}':
- - k8s:io.cilium.k8s.policy.derived-from=CiliumClusterwideNetworkPolicy
  - k8s:io.cilium.k8s.policy.name=controlplane
  - k8s:io.cilium.k8s.policy.uid=48b548eb-2a12-45a7-95fa-6cfdd452c871
'&LabelSelector{MatchLabels:map[string]string{reserved.kube-apiserver: ,},MatchExpressions:[]LabelSelectorRequirement{},}':
- - k8s:io.cilium.k8s.policy.derived-from=CiliumClusterwideNetworkPolicy
  - k8s:io.cilium.k8s.policy.name=controlplane
  - k8s:io.cilium.k8s.policy.uid=48b548eb-2a12-45a7-95fa-6cfdd452c871
'&LabelSelector{MatchLabels:map[string]string{reserved.remote-node: ,},MatchExpressions:[]LabelSelectorRequirement{},}':
- - k8s:io.cilium.k8s.policy.derived-from=CiliumClusterwideNetworkPolicy
  - k8s:io.cilium.k8s.policy.name=controlplane
  - k8s:io.cilium.k8s.policy.uid=48b548eb-2a12-45a7-95fa-6cfdd452c871
'&LabelSelector{MatchLabels:map[string]string{reserved.unmanaged: ,},MatchExpressions:[]LabelSelectorRequirement{},}':
- - k8s:io.cilium.k8s.policy.derived-from=CiliumClusterwideNetworkPolicy
  - k8s:io.cilium.k8s.policy.name=controlplane
  - k8s:io.cilium.k8s.policy.uid=48b548eb-2a12-45a7-95fa-6cfdd452c871
```
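For reference, here is a sketch of the working variant of the second ingress rule from the talos example above, with the cidrGroupRef replaced by fromCIDR and the (redacted) IPs inlined:
```yaml
# Sketch only: the fromCIDRSet rule of the talos policy above,
# rewritten to use fromCIDR with the IPs listed directly.
- fromCIDR:
  - "1.2.3.4/32"
  - "5.6.7.8/32"
  toPorts:
  - ports:
    - port: "50000"
      protocol: TCP
    - port: "50001"
      protocol: TCP
```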
How can we reproduce the issue?
- Have Cilium installed with Helm with hostFirewall.enabled=true and policyAuditMode=false values (see the values sketch after this list)
- Apply a CiliumCIDRGroup and a CiliumClusterwideNetworkPolicy which has both fromEntities and fromCIDRSet in its ingress spec
- Observe timeouts within various components and deny verdicts in cilium monitor -t policy-verdict
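For the first step, a minimal Helm values sketch, assuming the standard cilium/cilium chart keys (adapt to your own install):
```yaml
# Hypothetical values.yaml fragment for the reproduction setup;
# only the two settings mentioned in the steps above are shown.
hostFirewall:
  enabled: true
policyAuditMode: false
```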
Cilium Version
v1.16.4
Kernel Version
Linux 6.6.33-talos x86_64
Kubernetes Version
v1.28.7
Regression
No response
Sysdump
No response
Relevant log output
Anything else?
No response
Cilium Users Document
- Are you a user of Cilium? Please add yourself to the Users doc
Code of Conduct
- I agree to follow this project's Code of Conduct