Skip to content

cert-manager created multiple CertificateRequest objects with the same certificate-revision #4956

@jayme-github

Description

@jayme-github

Describe the bug:
At Wikimedia we run cert-manager with our own issuer and a cfssl PKI: https://gerrit.wikimedia.org/g/operations/software/cfssl-issuer

We've got a staging cluster with short-lived certificates (24h) where I noticed one not being refreshed.
From the data in API objects is seems as if the certificate was triggered for renewal 2022-02-06T17:00:03Z which led to the creation of CertificateRequest/toolhub-l8xjm first (2022-02-06T17:01:59Z) and CertificateRequest/toolhub-wvz2q second (2022-02-06T17:04:19Z), both sharing the same cert-manager.io/certificate-revision: 49 and cert-manager.io/private-key-secret-name: toolhub-rrvtj . See kubernetes_objects.yaml

During that time we had some pretty elevated latency on the kubernetes apiserver. Mainly CREATE and UPDATE on cert-manager.io/certificaterequest resources and apparently some connectivity issues with the kubernetes apiserver, as seen in cert-manager.log below.

I0206 17:00:03.399283       1 conditions.go:201] Setting lastTransitionTime for Certificate "toolhub" condition "Issuing" to 2022-02-06 17:00:03.399274083 +0000 UTC m=+10735.693974692
I0206 17:00:03.399216       1 trigger_controller.go:181] cert-manager/controller/certificates-trigger "msg"="Certificate must be re-issued" "key"="istio-system/toolhub" "message"="Renewing certificate as renewal was scheduled at 2022-02-06 17:00:00 +0000 UTC" "reason"="Renewing"
E0206 17:00:20.630289       1 controller.go:163] cert-manager/controller/certificates-request-manager "msg"="re-queuing item due to error processing" "error"="Internal error occurred: failed calling webhook \"webhook.cert-manager.io\": Post https://cert-manager-webhook.cert-manager.svc:443/validate?timeout=10s: context deadline exceeded" "key"="istio-system/toolhub" 
E0206 17:00:31.650166       1 controller.go:163] cert-manager/controller/certificates-request-manager "msg"="re-queuing item due to error processing" "error"="Internal error occurred: failed calling webhook \"webhook.cert-manager.io\": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=10s: context deadline exceeded" "key"="istio-system/toolhub" 
E0206 17:00:43.668839       1 controller.go:163] cert-manager/controller/certificates-request-manager "msg"="re-queuing item due to error processing" "error"="Internal error occurred: failed calling webhook \"webhook.cert-manager.io\": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=10s: context deadline exceeded" "key"="istio-system/toolhub" 
E0206 17:00:48.729312       1 controller.go:163] cert-manager/controller/certificates-request-manager "msg"="re-queuing item due to error processing" "error"="Internal error occurred: failed calling webhook \"webhook.cert-manager.io\": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=10s: dial tcp 10.192.76.185:443: connect: connection refused" "key"="istio-system/toolhub" 
E0206 17:00:57.816629       1 controller.go:163] cert-manager/controller/certificates-request-manager "msg"="re-queuing item due to error processing" "error"="Internal error occurred: failed calling webhook \"webhook.cert-manager.io\": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=10s: dial tcp 10.192.76.185:443: connect: connection refused" "key"="istio-system/toolhub" 
E0206 17:02:20.402566       1 controller.go:163] cert-manager/controller/certificates-request-manager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="istio-system/toolhub" 
E0206 17:06:26.899074       1 controller.go:163] cert-manager/controller/certificates-request-manager "msg"="re-queuing item due to error processing" "error"="failed whilst waiting for CertificateRequest to exist - this may indicate an apiserver running slowly. Request will be retried" "key"="istio-system/toolhub" 
I0206 17:07:34.907870       1 trace.go:205] Trace[861803746]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167 (06-Feb-2022 17:07:21.603) (total time: 13304ms):
Trace[861803746]: [13.30407288s] [13.30407288s] END
Trace[861803746]: ---"Objects listed" 13303ms (17:07:34.906)
I0206 17:07:57.199752       1 trace.go:205] Trace[782326899]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167 (06-Feb-2022 17:07:35.007) (total time: 22100ms):
Trace[782326899]: [22.100469212s] [22.100469212s] END
Trace[782326899]: ---"Objects listed" 22095ms (17:07:57.102)
I0206 17:08:05.299156       1 conditions.go:261] Setting lastTransitionTime for CertificateRequest "toolhub-l8xjm" condition "Approved" to 2022-02-06 17:08:04.500992384 +0000 UTC m=+44.594675153
I0206 17:08:05.199563       1 conditions.go:261] Setting lastTransitionTime for CertificateRequest "toolhub-wvz2q" condition "Approved" to 2022-02-06 17:08:03.500823508 +0000 UTC m=+43.594506243
I0206 17:08:05.904575       1 trace.go:205] Trace[850423915]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.21.3/tools/cache/reflector.go:167 (06-Feb-2022 17:07:35.006) (total time: 30897ms):
Trace[850423915]: ---"Objects listed" 30895ms (17:08:05.902)
Trace[850423915]: [30.897705171s] [30.897705171s] END
I0206 17:08:08.699125       1 requestmanager_controller.go:210] cert-manager/controller/certificates-request-manager "msg"="Multiple matching CertificateRequest resources exist, delete one of them. This is likely an error and should be reported on the issue tracker!" "key"="istio-system/toolhub" 
I0206 17:08:09.201178       1 requestmanager_controller.go:210] cert-manager/controller/certificates-request-manager "msg"="Multiple matching CertificateRequest resources exist, delete one of them. This is likely an error and should be reported on the issue tracker!" "key"="istio-system/toolhub" 
I0206 17:08:09.300472       1 requestmanager_controller.go:210] cert-manager/controller/certificates-request-manager "msg"="Multiple matching CertificateRequest resources exist, delete one of them. This is likely an error and should be reported on the issue tracker!" "key"="istio-system/toolhub" 
E0206 17:08:10.599180       1 controller.go:163] cert-manager/controller/certificates-readiness "msg"="re-queuing item due to error processing" "error"="multiple CertificateRequests were found for the 'next' revision 49, issuance is skipped until there are no more duplicates" "key"="istio-system/toolhub" 
E0206 17:08:13.400912       1 controller.go:163] cert-manager/controller/certificates-readiness "msg"="re-queuing item due to error processing" "error"="multiple CertificateRequests were found for the 'next' revision 49, issuance is skipped until there are no more duplicates" "key"="istio-system/toolhub"
...this message keeps repeating ever since
2022-02-06T17:02:00.026484+00:00 controller.certificaterequest istio-system toolhub-l8xjm CertificateRequest has not been approved yet. Ignoring.
2022-02-06T17:02:00.423743+00:00 controller.certificaterequest istio-system toolhub-l8xjm CertificateRequest has not been approved yet. Ignoring.
2022-02-06T17:04:19.921125+00:00 controller.certificaterequest istio-system toolhub-wvz2q CertificateRequest has not been approved yet. Ignoring.
2022-02-06T17:04:20.280381+00:00 controller.certificaterequest istio-system toolhub-wvz2q CertificateRequest has not been approved yet. Ignoring.
2022-02-06T17:08:06.328312+00:00 controller.certificaterequest istio-system toolhub-wvz2q Initialising Ready condition
2022-02-06T17:08:07.376063+00:00 controller.certificaterequest istio-system toolhub-l8xjm Initialising Ready condition
2022-02-06T17:08:07.441243+00:00 controller.certificaterequest istio-system toolhub-wvz2q Signing cert with k8s_staging discovery true
2022-02-06T17:08:08.047227+00:00 controller.certificaterequest istio-system toolhub-l8xjm Signing cert with k8s_staging discovery true
2022-02-06T17:08:08.536874+00:00 controller.certificaterequest istio-system toolhub-l8xjm CertificateRequest is Ready. Ignoring.
2022-02-06T17:08:08.536832+00:00 controller.certificaterequest istio-system toolhub-wvz2q CertificateRequest is Ready. Ignoring.

Expected behaviour:
I'd have expected that even if two CertificateRequests would have been created they would not be allowed to share the same cert-manager.io/certificate-revision. Also I would have expected some kind of error metric telling me that something was wrong and/or the certmanager_certificate_ready_status being False or Unknown.

Steps to reproduce the bug:
Unfortunately I don't know how to reproduce.

Anything else we need to know?:

Environment details::

  • Kubernetes version: 1.16
  • Cloud-provider/provisioner: n/a
  • cert-manager version: 1.5.4
  • Install method: helm

/kind bug

Metadata

Metadata

Assignees

No one assigned

    Labels

    kind/bugCategorizes issue or PR as related to a bug.lifecycle/rottenDenotes an issue or PR that has aged beyond stale and will be auto-closed.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions