
Stale/Stuck Challenges should be deleted after a given timeout #7234

@InfoSec812

Is your feature request related to a problem? Please describe.
We recently had someone change a role in AWS IAM, which stopped DNS-01 challenges via Route53 from working. The following error repeated several times per second:

E0814 12:49:27.775393       1 sync.go:282] "error cleaning up challenge" err=<
        error instantiating route53 challenge solver: unable to assume role: AccessDenied: User: arn:aws:iam::XXXXXXXXXXXX:user/REDACTED is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::XXXXXXXXXXXX:role/REDACTED
                status code: 403, request id: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
 > logger="cert-manager.challenges.finalizer" resource_name="REDACTED" resource_namespace="REDACTED" resource_kind="Challenge" resource_version="v1" dnsName="REDACTED" type="DNS-01"

Describe the solution you'd like
If a Challenge keeps failing, it should be deleted after a configurable timeout so that the operator can recreate it. Deleting the Challenges and letting them be recreated is what fixed our issue, because by then the ambient credentials annotation had been updated on the service account.
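
To make the request concrete, here is a rough sketch of the timeout check I have in mind. It is not cert-manager code: `is_stale`, `stale_challenges`, and `FINISHED_STATES` are hypothetical names, and it assumes a Challenge is represented as a plain dict carrying `metadata.creationTimestamp` and `status.state` the way the Kubernetes API returns them.

```python
from datetime import datetime, timedelta, timezone

# Assumption: only "valid" counts as finished; every other state is a
# candidate for cleanup once the timeout has elapsed.
FINISHED_STATES = {"valid"}


def parse_k8s_time(value: str) -> datetime:
    """Parse a Kubernetes timestamp such as 2024-08-14T12:49:27Z as UTC."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def is_stale(challenge: dict, now: datetime, timeout: timedelta) -> bool:
    """Return True if the Challenge is unfinished and older than the timeout."""
    state = challenge.get("status", {}).get("state", "")
    if state in FINISHED_STATES:
        return False
    created = parse_k8s_time(challenge["metadata"]["creationTimestamp"])
    return now - created > timeout


def stale_challenges(challenges: list, now: datetime, timeout: timedelta) -> list:
    """Return the names of all Challenges that should be deleted and recreated."""
    return [c["metadata"]["name"] for c in challenges if is_stale(c, now, timeout)]
```

Measuring from the creation time keeps the check stateless. Whether the timeout should instead count from the first failure is an open design question.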

Describe alternatives you've considered
Alternatively, Challenges could avoid storing the role ARN and look it up on each run instead.
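
A toy illustration of the difference between the two approaches, not how cert-manager is actually structured: `SnapshotSolver` and `LookupSolver` are hypothetical classes, and the "configuration" is just a dict holding a placeholder ARN.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SnapshotSolver:
    """Keeps the role ARN that was current when the Challenge was created."""
    role_arn: str

    def role_for_attempt(self) -> str:
        return self.role_arn


@dataclass
class LookupSolver:
    """Asks a lookup function for the role ARN on every attempt."""
    lookup_role: Callable[[], str]

    def role_for_attempt(self) -> str:
        return self.lookup_role()


# Placeholder configuration; the ARNs are redacted stand-ins.
config = {"role": "arn:aws:iam::XXXXXXXXXXXX:role/OLD"}
snapshot = SnapshotSolver(config["role"])
lookup = LookupSolver(lambda: config["role"])

# The configuration is fixed after the Challenge already exists.
config["role"] = "arn:aws:iam::XXXXXXXXXXXX:role/NEW"
```

After the change, `snapshot.role_for_attempt()` still returns the old ARN, while `lookup.role_for_attempt()` picks up the new one without the Challenge being recreated.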

Additional context

  1. An AWS role was changed, which made ambient credentials stop working.
  2. Challenges were created and got stuck, failing over and over.
  3. The ambient credentials annotation was fixed and the role was restored.
  4. Existing Challenges continued to fail until they were manually deleted and allowed to be recreated.
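
The manual cleanup in step 4 amounts to one `kubectl delete` per stuck Challenge. A small helper along these lines can generate the commands for review before anything is run. `kubectl_delete_commands` is a hypothetical name, and the input is assumed to be a list of dicts with `name` and `namespace` keys.

```python
def kubectl_delete_commands(challenges: list) -> list:
    """Build one `kubectl delete` argv list per Challenge; nothing is executed."""
    return [
        [
            "kubectl", "delete", "challenges.acme.cert-manager.io",
            c["name"], "--namespace", c["namespace"],
        ]
        for c in challenges
    ]


# Placeholder names; substitute the stuck Challenges from your cluster.
for argv in kubectl_delete_commands([{"name": "REDACTED", "namespace": "REDACTED"}]):
    print(" ".join(argv))
```

Each argv list can be passed to `subprocess.run` once the output has been checked.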

Environment details:

  • Kubernetes version: 1.28.9
  • Cloud-provider/provisioner: AWS/OpenShift
  • cert-manager version: v1.1.0
  • Install method: Operator Lifecycle Manager (OLM)

/kind feature

Labels

  • kind/bug: Categorizes issue or PR as related to a bug.
  • priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
