cainjector leader election fails due to lease issues #6646

@GeertvanHorrik

Description

I have been trying to install cert-manager for a while on a (private) AKS cluster, but cannot get it to work.

Symptoms

The first symptom is that the startup validation fails:

Post "https://<company_name>-cert-manager-webhook.cert-manager.svc:443/convert?timeout=30s":
      x509: certificate signed by unknown authority)

Steps to repro

  1. Provision a Kubernetes cluster using Terraform (tried Kubernetes 1.26.x and 1.27.x)

This is a private cluster and it has an Application Gateway in front, but that should not matter for this issue.

  2. Install cert-manager using this Helm command (CRDs are installed separately):

(As you might notice, I already tried a potential solution found elsewhere: the leaderElection namespace setting.)

helm upgrade --install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.13.3 \
  --set image.repository=some-custom-domain.azurecr.io/jetstack/cert-manager-controller \
  --set webhook.image.repository=some-custom-domain.azurecr.io/jetstack/cert-manager-webhook \
  --set cainjector.image.repository=some-custom-domain.azurecr.io/jetstack/cert-manager-controller \
  --set acmesolver.image.repository=some-custom-domain.azurecr.io/jetstack/cert-manager-acmesolver \
  --set startupapicheck.image.repository=some-custom-domain.azurecr.io/jetstack/cert-manager-ctl \
  --set global.leaderElection.namespace=cert-manager \
  --debug

Investigation

  1. Somehow the cainjector is not able to get the leases for leader election:
kubectl logs cert-manager-... -n cert-manager

shows

error retrieving resource lock cert-manager/cert-manager-controller: leases.coordination.k8s.io 
"cert-manager-controller" is forbidden: User "system:serviceaccount:cert-manager:cert-manager-cainjector" 
cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "cert-manager"

Extra steps to verify this behavior:

kubectl auth can-i get leases --as system:serviceaccount:cert-manager:cert-manager-cainjector --namespace cert-manager

returns no
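
A caveat about this check: if the chart's leader-election Role restricts lease access with `resourceNames` (an assumption worth confirming by inspecting the Role), a query that names no specific lease returns no even when access to a named lease is granted. A minimal sketch of a per-lease check follows. The helper name `lease_can_i` is hypothetical, and the lease name in the example is the one from the error above.

```shell
# Hypothetical helper: ask whether a service account may "get" one named lease.
# Usage: lease_can_i <service-account> <lease-name> <namespace>
lease_can_i() {
  sa=$1 lease=$2 ns=$3
  kubectl auth can-i get "leases/$lease" \
    --as "system:serviceaccount:$ns:$sa" \
    --namespace "$ns"
}

# Example (needs cluster access), using the lock name from the error message:
#   lease_can_i cert-manager-cainjector cert-manager-controller cert-manager
# To see which Role the binding points at:
#   kubectl describe rolebinding cert-manager-cainjector:leaderelection -n cert-manager
```

If the named check returns no for the lease in the error but yes for the lease names listed in the Role, the RBAC objects are probably fine and the process is requesting a lease it was never meant to hold.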

  2. Role bindings look good (these are untouched, straight from the Helm chart):
KIND                 NAMESPACE       NAME                                                   SERVICE_ACCOUNTS
RoleBinding          cert-manager    cert-manager-cainjector:leaderelection                 cert-manager-cainjector
RoleBinding          cert-manager    cert-manager-startupapicheck:create-cert               cert-manager-startupapicheck
RoleBinding          cert-manager    cert-manager-webhook:dynamic-serving                   cert-manager-webhook
RoleBinding          cert-manager    cert-manager:leaderelection                            cert-manager

I tried re-applying the role and fully recreating the cluster, but I can't find the cause.
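
One detail that may be worth ruling out: the lock in the error is named `cert-manager-controller`, yet the request comes from the `cert-manager-cainjector` service account, and the Helm command sets `cainjector.image.repository` to a path ending in `cert-manager-controller`. The sketch below compares each repository value against the image name expected for that component. The expected names are assumptions taken from the upstream v1.13 chart defaults, and `expected_image` and `check_repo` are hypothetical helpers.

```shell
# Assumed upstream image name for each chart value key (v1.13 defaults).
expected_image() {
  case "$1" in
    image.repository)                 echo cert-manager-controller ;;
    webhook.image.repository)         echo cert-manager-webhook ;;
    cainjector.image.repository)      echo cert-manager-cainjector ;;
    acmesolver.image.repository)      echo cert-manager-acmesolver ;;
    startupapicheck.image.repository) echo cert-manager-ctl ;;
    *) return 1 ;;
  esac
}

# Compare the last path segment of a repository value with the expected name.
# Returns 0 on match, 1 on mismatch, 2 for an unknown key.
check_repo() {
  key=$1 repo=$2
  want=$(expected_image "$key") || { echo "unknown key: $key"; return 2; }
  got=${repo##*/}   # strip everything up to the last "/"
  if [ "$got" = "$want" ]; then
    echo "ok: $key"
  else
    echo "MISMATCH: $key uses $got, expected $want"
    return 1
  fi
}

# Example with the value from the Helm command above:
#   check_repo cainjector.image.repository \
#     some-custom-domain.azurecr.io/jetstack/cert-manager-controller
```

A mismatch here would mean the cainjector Deployment runs the controller binary under the cainjector service account, which could explain a request for the controller's lease. This is a hypothesis to verify, not a confirmed diagnosis.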
