Closed
Labels:
- kind/bug: Categorizes issue or PR as related to a bug.
- priority/awaiting-more-evidence: Lowest priority. Possibly useful, but not yet enough support to actually get it done.
📢 This issue has been addressed in cert-manager 1.18.1: https://github.com/cert-manager/cert-manager/releases/tag/v1.18.1
ℹ Read the cert-manager 1.18 release-notes to learn more.
Describe the bug:
Issuing a new certificate fails with "error waiting for authorization" and "unexpected non-ACME API error".
controller logs:
I1005 19:55:21.630885 1 conditions.go:203] Setting lastTransitionTime for Certificate "<DOMAIN>-tls" condition "Ready" to 2024-10-05 19:55:21.630871806 +0000 UTC m=+276.682156984
I1005 19:55:21.631690 1 trigger_controller.go:223] "Certificate must be re-issued" logger="cert-manager.controller" key="gateway/<DOMAIN>-tls" reason="DoesNotExist" message="Issuing certificate as Secret does not exist"
I1005 19:55:21.636969 1 conditions.go:203] Setting lastTransitionTime for Certificate "<DOMAIN>-tls" condition "Issuing" to 2024-10-05 19:55:21.636948435 +0000 UTC m=+276.688233644
I1005 19:55:21.657677 1 controller.go:152] "re-queuing item due to optimistic locking on resource" logger="cert-manager.controller" error="Operation cannot be fulfilled on certificates.cert-manager.io \"<DOMAIN>-tls\": the object has been modified; please apply your changes to the latest version and try again"
I1005 19:55:21.657748 1 trigger_controller.go:223] "Certificate must be re-issued" logger="cert-manager.controller" key="gateway/<DOMAIN>-tls" reason="DoesNotExist" message="Issuing certificate as Secret does not exist"
I1005 19:55:21.657774 1 conditions.go:203] Setting lastTransitionTime for Certificate "<DOMAIN>-tls" condition "Issuing" to 2024-10-05 19:55:21.657766356 +0000 UTC m=+276.709051535
I1005 19:55:22.310137 1 conditions.go:263] Setting lastTransitionTime for CertificateRequest "<DOMAIN>-tls-1" condition "Approved" to 2024-10-05 19:55:22.310126399 +0000 UTC m=+277.361411567
I1005 19:55:22.334507 1 conditions.go:263] Setting lastTransitionTime for CertificateRequest "<DOMAIN>-tls-1" condition "Ready" to 2024-10-05 19:55:22.334488823 +0000 UTC m=+277.385774012
W1005 19:55:23.364319 1 warnings.go:70] metadata.finalizers: "finalizer.acme.cert-manager.io": prefer a domain-qualified finalizer name to avoid accidental conflicts with other finalizer writers
I1005 19:55:23.573713 1 pod.go:71] "creating HTTP01 challenge solver pod" logger="cert-manager.controller.http01.ensurePod" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01"
I1005 19:55:23.636553 1 httproute.go:67] "getting httpRoutes for challenge" logger="cert-manager.controller.http01.getGatewayHTTPRoute" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01" name="<DOMAIN>-tls-1-2127692413-1871217317" namespace="gateway"
I1005 19:55:23.636622 1 httproute.go:47] "creating HTTPRoute for challenge" logger="cert-manager.controller.http01.ensureGatewayHTTPRoute" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01" name="<DOMAIN>-tls-1-2127692413-1871217317" namespace="gateway"
I1005 19:55:23.652780 1 pod.go:59] "found one existing HTTP01 solver pod" logger="cert-manager.controller.http01.selfCheck.http01.ensurePod" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01" related_resource_name="cm-acme-http-solver-88cmh" related_resource_namespace="gateway" related_resource_kind="" related_resource_version=""
I1005 19:55:23.652854 1 service.go:45] "found one existing HTTP01 solver Service for challenge resource" logger="cert-manager.controller.http01.selfCheck.http01.ensureService" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01" related_resource_name="cm-acme-http-solver-gfhk7" related_resource_namespace="gateway" related_resource_kind="" related_resource_version=""
I1005 19:55:23.652889 1 httproute.go:67] "getting httpRoutes for challenge" logger="cert-manager.controller.http01.selfCheck.http01.getGatewayHTTPRoute" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01" name="<DOMAIN>-tls-1-2127692413-1871217317" namespace="gateway"
I1005 19:55:23.652934 1 httproute.go:55] "Found existing HTTPRoute for challenge" logger="cert-manager.controller.http01.selfCheck.http01.ensureGatewayHTTPRoute" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01" name="<DOMAIN>-tls-1-2127692413-1871217317" namespace="gateway"
E1005 19:55:23.719614 1 sync.go:208] "propagation check failed" err="wrong status code '404', expected '200'" logger="cert-manager.controller" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01"
I1005 19:55:23.737737 1 pod.go:59] "found one existing HTTP01 solver pod" logger="cert-manager.controller.http01.selfCheck.http01.ensurePod" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01" related_resource_name="cm-acme-http-solver-88cmh" related_resource_namespace="gateway" related_resource_kind="" related_resource_version=""
I1005 19:55:23.737816 1 service.go:45] "found one existing HTTP01 solver Service for challenge resource" logger="cert-manager.controller.http01.selfCheck.http01.ensureService" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01" related_resource_name="cm-acme-http-solver-gfhk7" related_resource_namespace="gateway" related_resource_kind="" related_resource_version=""
I1005 19:55:23.737856 1 httproute.go:67] "getting httpRoutes for challenge" logger="cert-manager.controller.http01.selfCheck.http01.getGatewayHTTPRoute" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01" name="<DOMAIN>-tls-1-2127692413-1871217317" namespace="gateway"
I1005 19:55:23.738070 1 httproute.go:55] "Found existing HTTPRoute for challenge" logger="cert-manager.controller.http01.selfCheck.http01.ensureGatewayHTTPRoute" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01" name="<DOMAIN>-tls-1-2127692413-1871217317" namespace="gateway"
E1005 19:55:23.760244 1 sync.go:208] "propagation check failed" err="wrong status code '503', expected '200'" logger="cert-manager.controller" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01"
I1005 19:55:23.773927 1 pod.go:59] "found one existing HTTP01 solver pod" logger="cert-manager.controller.http01.selfCheck.http01.ensurePod" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01" related_resource_name="cm-acme-http-solver-88cmh" related_resource_namespace="gateway" related_resource_kind="" related_resource_version=""
I1005 19:55:23.774259 1 service.go:45] "found one existing HTTP01 solver Service for challenge resource" logger="cert-manager.controller.http01.selfCheck.http01.ensureService" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01" related_resource_name="cm-acme-http-solver-gfhk7" related_resource_namespace="gateway" related_resource_kind="" related_resource_version=""
I1005 19:55:23.774420 1 httproute.go:67] "getting httpRoutes for challenge" logger="cert-manager.controller.http01.selfCheck.http01.getGatewayHTTPRoute" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01" name="<DOMAIN>-tls-1-2127692413-1871217317" namespace="gateway"
I1005 19:55:23.774576 1 httproute.go:55] "Found existing HTTPRoute for challenge" logger="cert-manager.controller.http01.selfCheck.http01.ensureGatewayHTTPRoute" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01" name="<DOMAIN>-tls-1-2127692413-1871217317" namespace="gateway"
E1005 19:55:23.791886 1 sync.go:208] "propagation check failed" err="wrong status code '503', expected '200'" logger="cert-manager.controller" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01"
I1005 19:55:33.721279 1 pod.go:59] "found one existing HTTP01 solver pod" logger="cert-manager.controller.http01.selfCheck.http01.ensurePod" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01" related_resource_name="cm-acme-http-solver-88cmh" related_resource_namespace="gateway" related_resource_kind="" related_resource_version=""
I1005 19:55:33.721950 1 service.go:45] "found one existing HTTP01 solver Service for challenge resource" logger="cert-manager.controller.http01.selfCheck.http01.ensureService" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01" related_resource_name="cm-acme-http-solver-gfhk7" related_resource_namespace="gateway" related_resource_kind="" related_resource_version=""
I1005 19:55:33.722048 1 httproute.go:67] "getting httpRoutes for challenge" logger="cert-manager.controller.http01.selfCheck.http01.getGatewayHTTPRoute" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01" name="<DOMAIN>-tls-1-2127692413-1871217317" namespace="gateway"
I1005 19:55:33.722158 1 httproute.go:55] "Found existing HTTPRoute for challenge" logger="cert-manager.controller.http01.selfCheck.http01.ensureGatewayHTTPRoute" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01" name="<DOMAIN>-tls-1-2127692413-1871217317" namespace="gateway"
E1005 19:56:01.969322 1 sync.go:403] "error waiting for authorization" err="context deadline exceeded" logger="cert-manager.controller.acceptChallenge" resource_name="<DOMAIN>-tls-1-2127692413-1871217317" resource_namespace="gateway" resource_kind="Challenge" resource_version="v1" dnsName="<DOMAIN>" type="HTTP-01"
E1005 19:56:01.969387 1 sync.go:240] "unexpected non-ACME API error" err="context deadline exceeded"
E1005 19:56:01.979482 1 controller.go:157] "re-queuing item due to error processing" err="context deadline exceeded" logger="cert-manager.controller"
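To make the failure sequence in the log above easier to see, here is a minimal sketch that filters klog-formatted lines down to the error entries and counts the status codes reported by the "propagation check failed" messages. The helper name `summarize` and both regular expressions are illustrative assumptions, not part of cert-manager.

```python
import re
from collections import Counter

# klog header: severity letter, MMDD, time, thread id, "file:line]", then the message.
KLOG_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+\d+ "
    r"(?P<src>[\w.]+:\d+)\] (?P<rest>.*)$"
)
# Matches the status code quoted in the self-check error message.
STATUS_RE = re.compile(r"wrong status code '(\d+)'")


def summarize(lines):
    """Return (errors, status_counts) for klog-formatted controller log lines.

    errors is a list of (time, source, message) tuples for severity "E" lines;
    status_counts maps each HTTP status code seen in those lines to its count.
    """
    errors = []
    status_counts = Counter()
    for line in lines:
        match = KLOG_RE.match(line.strip())
        if not match or match.group("sev") != "E":
            continue
        errors.append((match.group("time"), match.group("src"), match.group("rest")))
        status = STATUS_RE.search(match.group("rest"))
        if status:
            status_counts[status.group(1)] += 1
    return errors, status_counts
```

Fed the log above, this should show the self-check errors followed by the three "context deadline exceeded" entries at 19:56:01.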
cert resources:
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  creationTimestamp: "2024-10-05T19:55:21Z"
  generation: 1
  name: <DOMAIN>-tls
  namespace: gateway
  ownerReferences:
  - apiVersion: gateway.networking.k8s.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: Gateway
    name: main-gw
    uid: f294ac2e-0067-4041-8182-02a7e6a6a79e
  resourceVersion: "107070782"
  uid: d3a5eb42-25a3-4b9b-ba6b-130d90e96f32
spec:
  dnsNames:
  - <DOMAIN>
  issuerRef:
    group: cert-manager.io
    kind: Issuer
    name: letsencrypt
  secretName: <DOMAIN>-tls
  usages:
  - digital signature
  - key encipherment
status:
  conditions:
  - lastTransitionTime: "2024-10-05T19:55:21Z"
    message: Issuing certificate as Secret does not exist
    observedGeneration: 1
    reason: DoesNotExist
    status: "False"
    type: Ready
  - lastTransitionTime: "2024-10-05T20:11:55Z"
    message: 'The certificate request has failed to complete and will be retried:
      Failed to wait for order resource "<DOMAIN>-tls-1-2127692413" to become ready:
      order is in "invalid" state: '
    observedGeneration: 1
    reason: Failed
    status: "False"
    type: Issuing
  failedIssuanceAttempts: 1
  lastFailureTime: "2024-10-05T20:11:55Z"
---
apiVersion: cert-manager.io/v1
kind: CertificateRequest
metadata:
  annotations:
    cert-manager.io/certificate-name: <DOMAIN>-tls
    cert-manager.io/certificate-revision: "1"
    cert-manager.io/private-key-secret-name: <DOMAIN>-tls-59vtw
  creationTimestamp: "2024-10-05T19:55:22Z"
  generation: 1
  name: <DOMAIN>-tls-1
  namespace: gateway
  ownerReferences:
  - apiVersion: cert-manager.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: Certificate
    name: <DOMAIN>-tls
    uid: d3a5eb42-25a3-4b9b-ba6b-130d90e96f32
  resourceVersion: "107070778"
  uid: 7e70e4a5-4421-449a-bf0d-092d33bdcef1
spec:
  extra:
    authentication.kubernetes.io/credential-id:
    - JTI=bb18205d-<>-<>-<>-4efb1dbbe544
    authentication.kubernetes.io/node-name:
    - node1
    authentication.kubernetes.io/node-uid:
    - 6e297928-<>-<>-<>-8e4d5507a8dc
    authentication.kubernetes.io/pod-name:
    - cert-manager-855d849766-n9xhx
    authentication.kubernetes.io/pod-uid:
    - 80bac62a-<>-<>-<>-2a7923de992c
  groups:
  - system:serviceaccounts
  - system:serviceaccounts:cert-manager
  - system:authenticated
  issuerRef:
    group: cert-manager.io
    kind: Issuer
    name: letsencrypt
  request: LS0tL<>LQo=
  uid: 8aa51f82-<>-<>-<>-096272e1bc08
  usages:
  - digital signature
  - key encipherment
  username: system:serviceaccount:cert-manager:cert-manager
status:
  conditions:
  - lastTransitionTime: "2024-10-05T19:55:22Z"
    message: Certificate request has been approved by cert-manager.io
    reason: cert-manager.io
    status: "True"
    type: Approved
  - lastTransitionTime: "2024-10-05T19:55:22Z"
    message: 'Failed to wait for order resource "<DOMAIN>-tls-1-2127692413" to become
      ready: order is in "invalid" state: '
    reason: Failed
    status: "False"
    type: Ready
  failureTime: "2024-10-05T20:11:55Z"
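The useful signal in the two resources above sits in `status.conditions`. As a rough sketch, assuming the resource has already been loaded into a plain Python dict (for example from `kubectl get ... -o json`), the failing conditions can be pulled out like this. The function name `failed_conditions` and the trimmed `certificate` dict are illustrative only.

```python
def failed_conditions(resource):
    """Return (type, reason, message) for each condition whose status is "False"."""
    conditions = resource.get("status", {}).get("conditions", [])
    return [
        # Collapse the folded multi-line message into a single line.
        (c.get("type"), c.get("reason"), " ".join(c.get("message", "").split()))
        for c in conditions
        if c.get("status") == "False"
    ]


# Conditions transcribed from the Certificate in this report (other fields omitted).
certificate = {
    "status": {
        "conditions": [
            {
                "type": "Ready",
                "status": "False",
                "reason": "DoesNotExist",
                "message": "Issuing certificate as Secret does not exist",
            },
            {
                "type": "Issuing",
                "status": "False",
                "reason": "Failed",
                "message": "The certificate request has failed to complete and will be retried: "
                'Failed to wait for order resource "<DOMAIN>-tls-1-2127692413" to become ready: '
                'order is in "invalid" state: ',
            },
        ]
    }
}

for cond_type, reason, message in failed_conditions(certificate):
    print(f"{cond_type}: {reason}: {message}")
```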
Increasing the log verbosity level does not help troubleshoot the problem.
Expected behaviour:
The certificate is issued by the ACME issuer.
Steps to reproduce the bug:
helm values:
config:
  apiVersion: controller.config.cert-manager.io/v1alpha1
  enableGatewayAPI: true
  kind: ControllerConfiguration
crds:
  enabled: true
Issuer:
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt
  namespace: gateway
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-issuer-account-key
    solvers:
    - http01:
        gatewayHTTPRoute:
          parentRefs:
          - name: main-gw
            namespace: gateway
            kind: Gateway
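For context on the "propagation check failed" entries in the controller log: with an HTTP-01 solver, the challenge response is served at the well-known path defined by RFC 8555, and the log shows the controller expecting a 200 from it before it moves on. The sketch below reproduces that check by hand. The function names are hypothetical, the error string is copied from the log, and this is not cert-manager's actual implementation.

```python
import urllib.error
import urllib.request


def challenge_url(dns_name, token):
    """Build the HTTP-01 challenge URL (path defined by RFC 8555, section 8.3)."""
    return f"http://{dns_name}/.well-known/acme-challenge/{token}"


def check_status(status_code):
    """Return an error string shaped like the log message, or None on success."""
    if status_code != 200:
        return f"wrong status code '{status_code}', expected '200'"
    return None


def probe(dns_name, token, timeout=10):
    """Fetch the challenge URL once and report the result of check_status."""
    try:
        with urllib.request.urlopen(challenge_url(dns_name, token), timeout=timeout) as resp:
            return check_status(resp.status)
    except urllib.error.HTTPError as err:
        # Non-2xx responses are raised as HTTPError; the status is in err.code.
        return check_status(err.code)
```

Running `probe` against the affected hostname with the token from the Challenge resource, from outside the cluster, is one way to see whether the Gateway actually routes the solver path.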
Gateway:
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: main-gw
  namespace: gateway
  annotations:
    cert-manager.io/issuer: letsencrypt
spec:
  gatewayClassName: cilium
  listeners:
  - name: https
    hostname: <DOMAIN>
    port: 443
    protocol: HTTPS
    allowedRoutes:
      namespaces:
        from: All
    tls:
      mode: Terminate
      certificateRefs:
      - name: <DOMAIN>-tls
  - name: http
    hostname: <DOMAIN>
    protocol: HTTP
    port: 80
    allowedRoutes:
      namespaces:
        from: All
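Since HTTP-01 validation arrives over plain HTTP on port 80, it is worth confirming that the Gateway exposes a matching listener for the solver's HTTPRoute to attach to. The following is a minimal sketch over a Gateway loaded as a Python dict. The function name `http01_listeners` is hypothetical, and wildcard hostnames are deliberately not handled.

```python
def http01_listeners(gateway, hostname):
    """Return names of HTTP listeners on port 80 that cover the given hostname."""
    names = []
    for listener in gateway.get("spec", {}).get("listeners", []):
        if listener.get("protocol") != "HTTP" or listener.get("port") != 80:
            continue
        # A listener without a hostname matches every host; wildcards are ignored here.
        if listener.get("hostname") in (None, hostname):
            names.append(listener["name"])
    return names


# Listeners transcribed from the Gateway in this report (other fields omitted).
gateway = {
    "spec": {
        "listeners": [
            {"name": "https", "hostname": "<DOMAIN>", "port": 443, "protocol": "HTTPS"},
            {"name": "http", "hostname": "<DOMAIN>", "port": 80, "protocol": "HTTP"},
        ]
    }
}

print(http01_listeners(gateway, "<DOMAIN>"))
```

For the Gateway in this report the `http` listener qualifies, so a missing port 80 listener does not appear to be the cause here.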
If cert-manager is downgraded to 1.15.3, the certificate is issued fine, as expected.
Anything else we need to know?:
Environment details:
- Kubernetes version: v1.30.4
- Cloud-provider/provisioner: N/A
- cert-manager version: 1.16.0
- Install method: Helm
- Cilium version: 1.16.2
- Gateway API version: 1.1
/kind bug