policy-cidr-match-mode: nodes broken with cilium 1.17.x and cidr 0.0.0.0/0 #39656

Description

@networkhell

Is there an existing issue for this?

  • I have searched the existing issues

Version

equal or higher than v1.17.4 and lower than v1.18.0

What happened?

Environment:

k8s v1.31.5
cilium: v1.17.4 

The matching behaviour of Kubernetes NetworkPolicies changed with Cilium version 1.17.x!
We used the following policy to "permit" all traffic to and from a given pod:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-everything
  namespace: test
spec:
  podSelector: {}
  ingress:
  - from:
    - namespaceSelector: {}
      podSelector: {}
    - ipBlock:
        cidr: 0.0.0.0/0
    - ipBlock:
        cidr: ::/0
  egress:
  - to:
    - namespaceSelector: {}
      podSelector: {}
    - ipBlock:
        cidr: 0.0.0.0/0
    - ipBlock:
        cidr: ::/0
  policyTypes:
  - Ingress
  - Egress

Starting with v1.17.x this policy no longer works as an allow-all policy. In my case, access to NodePort services is no longer allowed, while ClusterIP services, pod IPs and external IPs still work. Instead I have to use the following policy to get things working. Note that policy-cidr-match-mode: nodes is set!

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-everything
  namespace: test
spec:
  podSelector: {}
  ingress:
  - {}
  egress:
  - {}
  policyTypes:
  - Ingress
  - Egress

It is worth noting that a) this works with Cilium < 1.17.0, and b) the issue only hits us when using the CIDR 0.0.0.0/0 or ::/0. If I target e.g. the node network /24 instead, connections are allowed as intended.
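For illustration, a sketch of the egress rule that still works for us, using a hypothetical node network of 10.42.0.0/24 (derived from the node IP seen in the log output below; adjust to your cluster):

  egress:
  - to:
    - namespaceSelector: {}
      podSelector: {}
    - ipBlock:
        # hypothetical node subnet; replace with your actual node network
        cidr: 10.42.0.0/24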

Any hint as to why this behaviour has changed? Is there any chance to fix this, or do users have to work around it? It currently breaks a lot of deployments with bundled network policies, as they are often written in the style shown above.

How can we reproduce the issue?

  1. install Cilium with Helm
  2. use the Helm option policyCIDRMatchMode: nodes
  3. create a deny-all network policy (sketches for steps 2 and 3 follow after this list)
  4. create the network policy mentioned above
  5. create a pod
  6. observe that connections to the kube-apiserver are blocked
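For step 2, a minimal sketch of the Helm values, using the option name exactly as written above (the precise value format may differ depending on the chart version):

# values.yaml
policyCIDRMatchMode: nodes

For step 3, a sketch of the deny-all baseline policy (the exact manifest we use may differ slightly):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
  namespace: test
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress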

Cilium Version

1.17.4

Kernel Version

6.6.88

Kubernetes Version

1.32.5

Regression

It works with 1.16.9 and below...

Sysdump

Upload of the sysdump fails, probably due to its size of 35 MB.

Relevant log output

kubectl exec -n source -i -t alps nicolaka/netshoot -- /usr/bin/nc -zvw1 10.96.0.1 443
nc: 10.96.0.1 (10.96.0.1:443): Operation timed out
command terminated with exit code 1

hubble observe --all | grep alps
May 21 09:44:08.307: source/alps (ID:48329) <> default/kubernetes:443 (world) pre-xlate-fwd TRACED (TCP)
May 21 09:44:08.307: source/alps (ID:48329) <> 10.42.0.207:6443 (ID:33554461) post-xlate-fwd TRANSLATED (TCP)
May 21 09:44:08.307: source/alps:35979 (ID:48329) <> 10.42.0.207:6443 (ID:33554461) policy-verdict:none EGRESS DENIED (TCP Flags: SYN)
May 21 09:44:08.307: source/alps:35979 (ID:48329) <> 10.42.0.207:6443 (ID:33554461) Policy denied DROPPED (TCP Flags: SYN)

Anything else?

There is another open issue that is not flagged as a bug but describes more or less the same behaviour, focused only on connections to the kube-apiserver: #39573

Cilium Users Document

  • Are you a user of Cilium? Please add yourself to the Users doc

Code of Conduct

  • I agree to follow this project's Code of Conduct

Labels

area/agent: Cilium agent related.
kind/bug: This is a bug in the Cilium logic.
kind/community-report: This was reported by a user in the Cilium community, e.g. via Slack.
kind/regression: This functionality worked fine before, but was broken in a newer release of Cilium.
severity/high: Widespread impact to common deployment configurations.
sig/policy: Impacts whether traffic is allowed or denied based on user-defined policies.
