CFP: Fix Ingress traffic interactions with other Policy enforcement #24536

@youngnick

Background

Currently, Cilium's Ingress support has implications when interacting with other Policies that are a bit surprising and aren't well explained.

When traffic arrives at the per-node Envoy from outside the cluster (in Cilium identity terms, it has the world identity) and is bound for a listener created by Cilium's Ingress (or Gateway API) support, it picks up the special ingress identity as it is routed through Envoy. Network Policy (whether CiliumNetworkPolicy or CiliumClusterwideNetworkPolicy) can then match on this identity in an ingress or egress policy.

Note
Yes, it is confusing that "ingress" names both traffic coming into the cluster and traffic going to an identity. This term overloading has come from two different directions and is tricky to fix. In this issue, Ingress means the Kubernetes resource Ingress, and other uses will be disambiguated with a modifier like "ingress identity", "ingress policy", or "ingress traffic".

However, when traffic arrives at the per-node Envoy from inside the cluster and is bound for an Ingress listener, the original security identity is maintained. This is because Cilium's Ingress support is built on our Layer 7 Policy support, and uses the same Envoys. This means that if users want in-cluster workloads to talk to the "outside" of their Ingress or Gateway API listeners, and also have Network Policy that restricts traffic between workloads, they may need to allow far more traffic in a Network Policy than they intended.

This has caused confusion and outages for a number of Cilium users, and that number will only grow as more people use Cilium's Ingress and Gateway API support.

In an old-school physical network with a dedicated firewall device listening on certain open ports on an internet connection, this type of connection was called "hairpinning" (like a hairpin turn on a racetrack), and had its own series of special options.

In Cilium's case, fully supporting Ingress hairpinning could safely have internal traffic pick up the ingress identity, because the Ingress and Gateway API listeners are effectively performing some Policy enforcement themselves. This is because both Ingress and Gateway API will only route requested traffic to their defined backends - they are effectively an allowlist for very specific Layer 7 traffic when passing through the per-node Envoys.
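The allowlist argument can be illustrated with a toy model. The sketch below is not how Envoy or Cilium match routes; the `Route` type, the `route_request` function, and the host, path, and backend values are all hypothetical. It only shows the property being relied on: a request reaches a backend when a listener rule names it, and goes nowhere otherwise.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Route:
    """One hypothetical Ingress rule: host + path prefix -> backend Service."""
    host: str
    path_prefix: str
    backend: str


def route_request(routes: Sequence[Route], host: str, path: str) -> Optional[str]:
    """Return the backend of the first matching rule, or None if nothing matches."""
    for r in routes:
        if r.host == host and path.startswith(r.path_prefix):
            return r.backend
    # No rule names this host/path, so the listener forwards it nowhere.
    return None


routes = [Route("shop.example.com", "/api", "shop-api")]
print(route_request(routes, "shop.example.com", "/api/items"))  # shop-api
print(route_request(routes, "shop.example.com", "/admin"))      # None
```

Because only explicitly listed host and path combinations are forwarded, the listener already narrows what a hairpinned in-cluster client can reach before any Network Policy is evaluated.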

Proposed changes

This CFP proposes two changes:

  • Firstly, all traffic passing through an Ingress or Gateway API listener should assume the ingress identity. This includes traffic originating inside the cluster, and will mean that traffic from inside will lose its internal identity. This is hairpinning for internal traffic.
  • Secondly, the documentation will be updated to call out this interaction, and make it clear that by having traffic originating from inside the cluster go to an Ingress listener, it's logically "leaving" the cluster and becoming ingress identity traffic instead.
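The first change can be summarised as a small decision function. This is a toy model for comparison only, not Cilium's datapath code; the `Identity` enum and `identity_at_backend` function are hypothetical names, and `APP` stands in for any in-cluster workload identity.

```python
from enum import Enum


class Identity(Enum):
    WORLD = "world"      # traffic from outside the cluster
    INGRESS = "ingress"  # special identity attached by Ingress/Gateway API listeners
    APP = "app"          # placeholder for any in-cluster workload identity


def identity_at_backend(source: Identity, proposed: bool) -> Identity:
    """Identity that policy sees after traffic passes through an Ingress listener."""
    if proposed:
        # Proposed behaviour: every source assumes the ingress identity.
        return Identity.INGRESS
    # Current behaviour: only world traffic is rewritten; in-cluster identity is kept.
    return Identity.INGRESS if source is Identity.WORLD else source


for src in (Identity.WORLD, Identity.APP):
    current = identity_at_backend(src, proposed=False)
    after = identity_at_backend(src, proposed=True)
    print(f"{src.value}: current={current.value} proposed={after.value}")
```

In this model, a backend's policy under the proposed behaviour only has to allow the ingress identity, regardless of where the client sits; under the current behaviour it also has to allow every in-cluster identity that might reach the listener.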

Labels

  • area/agent: Cilium agent related.
  • area/servicemesh: GH issues or PRs regarding servicemesh
  • kind/feature: This introduces new functionality.
