
Design: support Gateway API's new ListenerSet #7839


Open: 7 commits to be merged into base branch master

@maelvls (Member) commented Jul 3, 2025

Design file: 20250703.gatewayapi-listenerset.md

Pull Request Motivation

I'd like to propose a design to address:

This design supersedes two designs:

/kind design

Release Note

NONE

Signed-off-by: Maël Valais <mael@vls.dev>
@cert-manager-prow bot added the kind/design, release-note-none, dco-signoff: yes, and size/L labels Jul 3, 2025
Signed-off-by: Maël Valais <mael@vls.dev>
maelvls added 2 commits July 4, 2025 10:44
…ior will be

Signed-off-by: Maël Valais <mael@vls.dev>
Signed-off-by: Maël Valais <mael@vls.dev>
@cert-manager-prow bot added the size/XL label and removed the size/L label Jul 7, 2025
@maelvls maelvls changed the title Proposal: support Gateway API's new ListenerSet Design: support Gateway API's new ListenerSet Jul 8, 2025
@maelvls maelvls requested a review from wallrj July 8, 2025 16:22
@wallrj wallrj left a comment

Thanks @maelvls

This looks great.

I haven't tried the Gateway API examples myself, but I will. I'd like to try updating the getting started tutorials to use Gateway API instead of Ingress so that I can understand all this better.

@cert-manager-prow bot added the approved label Jul 9, 2025
maelvls and others added 3 commits July 9, 2025 20:48
Signed-off-by: Maël Valais <mael@vls.dev>
Co-authored-by: Richard Wall <wallrj@users.noreply.github.com>
…nerSet in example

Signed-off-by: Maël Valais <mael@vls.dev>
Signed-off-by: Maël Valais <mael@vls.dev>

Two workarounds have been found by cluster operators:

- **Using a wildcard certificate as hostname on the Gateway:** this solution introduces risks associated with wildcard certificates (cf. [OWASP notes](https://cheatsheetseries.owasp.org/cheatsheets/Transport_Layer_Security_Cheat_Sheet.html#carefully-consider-the-use-of-wildcard-certificates) on using wildcard certificates).

I feel compelled to point out that the point of allowing wildcard certificates on the Gateway at all was to minimize the risk of exposure of the wildcard key. That's one of the primary reasons why we built ReferenceGrant - so that very-secure keys could be consumed by Gateway owners without being able to be read by those same Gateway owners.

That said, I don't think that changes this recommendation: there are risks associated with wildcard certificates that need to be managed, and managing them with cert-manager currently doesn't use a separate namespace and ReferenceGrant, so the concerns called out by OWASP do apply in cert-manager's case.

| NGINX Gateway Fabric | ✅ Yes | Gateways reuse the same NGINX Deployment; no extra pods or infra per Gateway. |
| Envoy Gateway | ✅ Yes | Gateways share the same Envoy fleet unless you configure isolation; no new pods. Usually, people will use `mergeGateways` whereby all Gateways referring to the same GatewayClass are merged and served by the same set of Envoy Proxies. |
| ingress-nginx (Gateway API mode) | ✅ Yes | Gateways reuse existing ingress-nginx pods; no new deployments are created per Gateway. |
| Cilium Service Mesh | ✅ Yes | Gateways are purely logical; routing happens in eBPF in kernel. No pods or infra created. |

This is, sadly, not one hundred percent correct; it actually depends on how LoadBalancer Services are handled at Layer 4 in the cluster. If the cluster provisions a new cloud LB for each LoadBalancer Service, then each Gateway will provision a new cloud LB, because every Gateway creates a LoadBalancer Service to attract traffic to Cilium nodes.

Cilium's Gateway API support does not (currently) offer a single shared LB the way its Ingress support does.

If Cilium's LB-IPAM is in use, then extra VIPs are basically free.

Again, with all of that said, it doesn't materially alter the calculation here that this is worth building.

So, FYI rather than blocking.

(I agree with @youngnick here that the table isn't correct, but that the conclusion is.)

@youngnick

This design makes sense to me, great to see.

I'd also encourage whoever implements this to make clear to users that this is for TESTING ONLY. The X prefix on XListenerSet is in case we need to make breaking changes. Experimental resources like that should not be used in production, and migrating from XListenerSet to ListenerSet when this goes to stable will require manual action (pulling down the object, changing the Kind and Group, and re-adding a new object). This is by design, as this is for testing only.
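
As a rough illustration of the manual migration described above (pull down the object, change the Kind and Group, re-add it), here is a minimal Python sketch that rewrites an exported manifest. It is an assumption-laden example, not a supported tool: the experimental `apiVersion` is assumed to be `gateway.networking.x-k8s.io/v1alpha1`, the stable `apiVersion` is hypothetical because it has not been decided, and `to_stable_listener_set` is a name made up for this example.

```python
import copy

# Assumption: the experimental resource is served from the x-k8s.io group.
EXPERIMENTAL_API_VERSION = "gateway.networking.x-k8s.io/v1alpha1"
EXPERIMENTAL_KIND = "XListenerSet"

# Hypothetical: the stable group/version is not known yet; adjust when it is.
STABLE_API_VERSION = "gateway.networking.k8s.io/v1"
STABLE_KIND = "ListenerSet"

# Metadata fields set by the API server that must not be carried over
# when the object is re-created under a different Kind.
SERVER_SET_METADATA = (
    "resourceVersion",
    "uid",
    "creationTimestamp",
    "generation",
    "managedFields",
)


def to_stable_listener_set(manifest: dict) -> dict:
    """Return a copy of an XListenerSet manifest rewritten as a ListenerSet."""
    if manifest.get("kind") != EXPERIMENTAL_KIND:
        raise ValueError(f"expected kind {EXPERIMENTAL_KIND!r}, got {manifest.get('kind')!r}")

    migrated = copy.deepcopy(manifest)  # never mutate the caller's object
    migrated["apiVersion"] = STABLE_API_VERSION
    migrated["kind"] = STABLE_KIND

    metadata = migrated.get("metadata", {})
    for field in SERVER_SET_METADATA:
        metadata.pop(field, None)

    # Status is owned by the controller and is recomputed for the new object.
    migrated.pop("status", None)
    return migrated
```

The function only rewrites the manifest in memory; creating the new object and deleting the old one remain separate, deliberate steps, which matches the point that the migration requires manual action.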

Implementing this now for cert-manager is amazing though, because it will allow implementations to be sure that one of the most common use cases for this flow is tested and validated before we move the whole object to Standard/Stable.

That's going to allow us to get this design right as early as possible, and hopefully to promote XListenerSet to ListenerSet as soon as possible.

Thanks for this design @maelvls, nice work.

@kflynn kflynn left a comment

Looks good to me, too.


ListenerSet provides a mechanism allowing developers to manage TLS configurations, restoring self-service capabilities akin to Ingress. The following diagram illustrates the fact that developers must now coordinate with cluster operators to configure the `tls` block (in green):

![gateway-with-manifests.excalidraw-fs8](https://hackmd.io/_uploads/r1rgH2tblx.png)

I think that it could be really helpful to include the diagram showing the ListenerSet approach here, as well (https://hackmd.io/_uploads/B1PRYpKZgl.png) -- reading over this, my thought at this point was to wonder why you were talking about ListenerSet and then drawing about Listener. 🤔


### Issuer Annotations

What about a Gateway resource with the `cert-manager.io/issuer` and a listener

It makes me very sad that this needs to be a thing, but I think that the way you've covered it makes the most sense.

Hah, me too. But the only other option is to do some sort of Policy behavior, which would be as bad, and would break the existing-user experience too much.

@cert-manager-prow (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kflynn, wallrj

Labels

- approved: Indicates a PR has been approved by an approver from all required OWNERS files.
- dco-signoff: yes: Indicates that all commits in the pull request have the valid DCO sign-off message.
- kind/design: Categorizes issue or PR as related to design.
- release-note-none: Denotes a PR that doesn't merit a release note.
- size/XL: Denotes a PR that changes 500-999 lines, ignoring generated files.

4 participants