-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Description
EDIT: We decided to go with the more aggressive timeline, adding a beta-level feature flag to cert-manager in v1.18 which can be disabled. A future release - likely v1.19 - will make the flag GA and stop it from being disabled.
Flynn from Buoyant reminded me about the default value for PrivateKeyRotationPolicy
being Never
and how unintuitive that can be for users.
This default can be harmful to say the least. We document that there's a footgun but that only goes so far.
Flynn's concerns have convinced me that we should move forwards with fixing this. Flynn advocated for a major version bump (cert-manager v2); I think our use of semver hasn't ever been literal and we've made bigger changes in minor releases before, though. Talking about v2 opens a big can of worms around other breaking changes which would distract from this specific issue, so I'm proposing a change in a minor release.
Why change?
Security; if a private key is exposed, users may (reasonably) assume that re-issuing a certificate (e.g. using cmctl renew
) will generate a new private key. Flynn's point is that unless a user has already deeply read the cert-manager docs (which most won't have, it's pretty simple to get cert-manager working for most) this is highly surprising and will mean that cert-manager isn't protecting the user from something they expect.
Users who need to not change the private key on rotation (e.g. if they want to preserve it and just change the cert, which is less common but still happens) would need to explicitly set the rotation policy to never to avoid losing the key.
Proposal
I think we should change the default to Always
ASAP and provide a short-term option for people to maintain the old behavior. I propose two options for timelines below.
Both timelines use a feature gate to make the change. A feature gate gives users the ability to "opt in" to the old behaviour until we eventually promote the gate to GA (giving them time to add rotationPolicy: Never
to Certificate
s which need it).
I also propose backporting a warning to both v1.16 and v1.17 for any Certificate
which doesn't explicitly set rotationPolicy
, warning that the default will change.
Timeline 1: Add to cert-manager v1.18 at beta level
Quickest: just change the default in v1.18. I think we've made bigger "breaking" changes than this in past minor releases, so we could justify making the change without giving people months to prepare.
This is equivalent to introducing the feature gate at "beta" level in v1.18.
Timeline 2: Alpha feature gate in v1.18, beta in v1.19
If we think making the change in v1.18 is too aggressive, we could add an alpha feature gate in v1.18, along with the new warning mentioned above about the default changing.
We'd then promote that feature gate to beta in v1.19, enabling it by default but keeping the ability to revert to the old behavior.
This is several months slower than timeline one, and I don't personally feel that we need to be this cautious, but I think it's absolutely a reasonable position to take.
Moving Forwards
My default position is timeline 1: adding the feature gate at "beta" level. We can add big warnings in the release notes for v1.18 showing people how to un-set the feature gate if needed, and I feel that's enough.
I'd like to reach consensus on this quickly so this isn't hanging over us for months. I'm happy to do the work. I'll start a lazy consensus on this issue in slack, and unless a plurality emerges in favour of timeline 2 I'll implement timeline 1 by the end of the lazy consensus period.