
Defaulting of kubelet config not taken into consideration while calculating the worker pool hash #11416

@shafeeqes

Description

How to categorize this issue?

/area quality
/kind bug

What happened:
The changes that cause a rolling update of the worker nodes are documented here:

#### Rolling Update Triggers
Apart from the above mentioned triggers, a rolling update of the shoot worker nodes is also triggered for some changes to your worker pool specification (`.spec.provider.workers[]`, even if you don't change the Kubernetes or machine image version).
The complete list of fields that trigger a rolling update:
* `.spec.kubernetes.version` (except for patch version changes)
* `.spec.provider.workers[].machine.image.name`
* `.spec.provider.workers[].machine.image.version`
* `.spec.provider.workers[].machine.type`
* `.spec.provider.workers[].volume.type`
* `.spec.provider.workers[].volume.size`
* `.spec.provider.workers[].providerConfig` (except if the feature gate `NewWorkerPoolHash` is enabled)
* `.spec.provider.workers[].cri.name`
* `.spec.provider.workers[].kubernetes.version` (except for patch version changes)
* `.spec.systemComponents.nodeLocalDNS.enabled`
* `.status.credentials.rotation.certificateAuthorities.lastInitiationTime` (changed by Gardener when a shoot CA rotation is initiated) when worker pool is not part of `.status.credentials.rotation.certificateAuthorities.pendingWorkersRollouts[]`
* `.status.credentials.rotation.serviceAccountKey.lastInitiationTime` (changed by Gardener when a shoot service account signing key rotation is initiated) when worker pool is not part of `.status.credentials.rotation.serviceAccountKey.pendingWorkersRollouts[]`
If feature gate `NewWorkerPoolHash` is enabled:
* `.spec.kubernetes.kubelet.kubeReserved` (unless a worker pool-specific value is set)
* `.spec.kubernetes.kubelet.systemReserved` (unless a worker pool-specific value is set)
* `.spec.kubernetes.kubelet.evictionHard` (unless a worker pool-specific value is set)
* `.spec.kubernetes.kubelet.cpuManagerPolicy` (unless a worker pool-specific value is set)
* `.spec.provider.workers[].kubernetes.kubelet.kubeReserved`
* `.spec.provider.workers[].kubernetes.kubelet.systemReserved`
* `.spec.provider.workers[].kubernetes.kubelet.evictionHard`
* `.spec.provider.workers[].kubernetes.kubelet.cpuManagerPolicy`
Changes to `kubeReserved` or `systemReserved` do not trigger a node roll if their sum does not change.
Generally, the provider extension controllers might have additional constraints for changes leading to rolling updates, so please consult the respective documentation as well.
In particular, if the feature gate `NewWorkerPoolHash` is enabled and a worker pool uses the new hash, then the `providerConfig` as a whole is not included. Instead only fields selected by the provider extension are considered.
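
To make the `kubeReserved`/`systemReserved` sum rule quoted above concrete, here is a minimal, hypothetical sketch (plain Go using `k8s.io/apimachinery`'s `resource.Quantity`, not actual Gardener code): shifting part of a reservation from one field to the other changes both fields, but not their sum, so no node roll is triggered.

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

// sumCPU adds two CPU reservations the way the sum-based trigger treats them:
// only the combined value matters, not how it is split across the two fields.
func sumCPU(kubeReserved, systemReserved string) string {
	sum := resource.MustParse(kubeReserved)
	sum.Add(resource.MustParse(systemReserved))
	return sum.String()
}

func main() {
	// Moving 100m from systemReserved to kubeReserved changes both fields,
	// but the sum stays at 180m, so no node roll is triggered.
	fmt.Println(sumCPU("80m", "100m")) // 180m
	fmt.Println(sumCPU("180m", "0"))   // 180m
}
```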

Based on these fields, a hash is calculated:

```go
// KeyV2 returns the key that can be used as secret name based on the provided worker name,
// Kubernetes version, machine type, image, worker volume, CRI, credentials rotation, node local dns
// and kubelet configuration.
func KeyV2(
	kubernetesVersion *semver.Version,
	credentialsRotation *gardencorev1beta1.ShootCredentialsRotation,
	worker *gardencorev1beta1.Worker,
	nodeLocalDNSEnabled bool,
	kubeletConfiguration *gardencorev1beta1.KubeletConfig,
) string {
	if kubernetesVersion == nil {
		return ""
	}

	kubernetesMajorMinorVersion := fmt.Sprintf("%d.%d", kubernetesVersion.Major(), kubernetesVersion.Minor())

	data := []string{
		kubernetesMajorMinorVersion,
		worker.Machine.Type,
		worker.Machine.Image.Name + *worker.Machine.Image.Version,
	}

	if worker.Volume != nil {
		data = append(data, worker.Volume.VolumeSize)
		if worker.Volume.Type != nil {
			data = append(data, *worker.Volume.Type)
		}
	}

	if worker.CRI != nil {
		data = append(data, string(worker.CRI.Name))
	}

	if credentialsRotation != nil {
		if credentialsRotation.CertificateAuthorities != nil {
			if lastInitiationTime := v1beta1helper.LastInitiationTimeForWorkerPool(worker.Name, credentialsRotation.CertificateAuthorities.PendingWorkersRollouts, credentialsRotation.CertificateAuthorities.LastInitiationTime); lastInitiationTime != nil {
				data = append(data, lastInitiationTime.Time.String())
			}
		}

		if credentialsRotation.ServiceAccountKey != nil {
			if lastInitiationTime := v1beta1helper.LastInitiationTimeForWorkerPool(worker.Name, credentialsRotation.ServiceAccountKey.PendingWorkersRollouts, credentialsRotation.ServiceAccountKey.LastInitiationTime); lastInitiationTime != nil {
				data = append(data, lastInitiationTime.Time.String())
			}
		}
	}

	if nodeLocalDNSEnabled {
		data = append(data, "node-local-dns")
	}

	if kubeletConfiguration != nil {
		if resources := v1beta1helper.SumResourceReservations(kubeletConfiguration.KubeReserved, kubeletConfiguration.SystemReserved); resources != nil {
			data = append(data, fmt.Sprintf("%s-%s-%s-%s", resources.CPU, resources.Memory, resources.PID, resources.EphemeralStorage))
		}

		if eviction := kubeletConfiguration.EvictionHard; eviction != nil {
			data = append(data, fmt.Sprintf("%s-%s-%s-%s-%s",
				ptr.Deref(eviction.ImageFSAvailable, ""),
				ptr.Deref(eviction.ImageFSInodesFree, ""),
				ptr.Deref(eviction.MemoryAvailable, ""),
				ptr.Deref(eviction.NodeFSAvailable, ""),
				ptr.Deref(eviction.NodeFSInodesFree, ""),
			))
		}

		if policy := kubeletConfiguration.CPUManagerPolicy; policy != nil {
			data = append(data, *policy)
		}
	}

	var result string
	for _, v := range data {
		result += utils.ComputeSHA256Hex([]byte(v))
	}

	return fmt.Sprintf("gardener-node-agent-%s-%s", worker.Name, utils.ComputeSHA256Hex([]byte(result))[:16])
}
```
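
For orientation, a hypothetical invocation might look like the following (all values and the surrounding function are made up for illustration): the `kubeletConfiguration` argument is the configuration as it comes from the Shoot, so the defaults applied later by `setConfigDefaults` (shown below) never reach this hash.

```go
// Hypothetical call of KeyV2, just to illustrate its inputs (all values are made up).
func exampleWorkerPoolKey() string {
	worker := &gardencorev1beta1.Worker{
		Name: "worker-a",
		Machine: gardencorev1beta1.Machine{
			Type:  "m5.large",
			Image: &gardencorev1beta1.ShootMachineImage{Name: "gardenlinux", Version: ptr.To("1443.3.0")},
		},
	}

	// With a nil kubelet configuration, none of the kubelet-related items end up in the hash,
	// even though the rendered kubelet config on the node will be full of defaulted values.
	return KeyV2(semver.MustParse("1.28.4"), nil, worker, false, nil)
}
```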

Some fields of the kubelet config are considered in the hash. However, while rendering the file content for the kubelet config, some defaults are applied to the config:

```go
func setConfigDefaults(c *components.ConfigurableKubeletConfigParameters, kubernetesVersion *semver.Version) {
	if c.CpuCFSQuota == nil {
		c.CpuCFSQuota = ptr.To(true)
	}

	if c.CpuManagerPolicy == nil {
		c.CpuManagerPolicy = ptr.To(kubeletconfigv1beta1.NoneTopologyManagerPolicy)
	}

	if c.EvictionHard == nil {
		c.EvictionHard = make(map[string]string, 5)
	}
	for k, v := range evictionHardDefaults {
		if c.EvictionHard[k] == "" {
			c.EvictionHard[k] = v
		}
	}

	if c.EvictionSoft == nil {
		c.EvictionSoft = make(map[string]string, 5)
	}
	for k, v := range evictionSoftDefaults {
		if c.EvictionSoft[k] == "" {
			c.EvictionSoft[k] = v
		}
	}

	if c.EvictionSoftGracePeriod == nil {
		c.EvictionSoftGracePeriod = make(map[string]string, 5)
	}
	for k, v := range evictionSoftGracePeriodDefaults {
		if c.EvictionSoftGracePeriod[k] == "" {
			c.EvictionSoftGracePeriod[k] = v
		}
	}

	if c.EvictionMinimumReclaim == nil {
		c.EvictionMinimumReclaim = make(map[string]string, 5)
	}
	for k, v := range evictionMinimumReclaimDefaults {
		if c.EvictionMinimumReclaim[k] == "" {
			c.EvictionMinimumReclaim[k] = v
		}
	}

	if c.EvictionPressureTransitionPeriod == nil {
		c.EvictionPressureTransitionPeriod = &metav1.Duration{Duration: 4 * time.Minute}
	}

	if c.EvictionMaxPodGracePeriod == nil {
		c.EvictionMaxPodGracePeriod = ptr.To[int32](90)
	}

	if c.FailSwapOn == nil {
		c.FailSwapOn = ptr.To(true)
	}

	if c.ImageGCHighThresholdPercent == nil {
		c.ImageGCHighThresholdPercent = ptr.To[int32](50)
	}

	if c.ImageGCLowThresholdPercent == nil {
		c.ImageGCLowThresholdPercent = ptr.To[int32](40)
	}

	if c.SerializeImagePulls == nil {
		c.SerializeImagePulls = ptr.To(true)
	}

	if c.KubeReserved == nil {
		c.KubeReserved = make(map[string]string, 2)
	}
	for k, v := range kubeReservedDefaults {
		if c.KubeReserved[k] == "" {
			c.KubeReserved[k] = v
		}
	}

	if c.MaxPods == nil {
		c.MaxPods = ptr.To[int32](110)
	}

	if c.ContainerLogMaxSize == nil {
		c.ContainerLogMaxSize = ptr.To("100Mi")
	}

	c.ProtectKernelDefaults = ptr.To(ShouldProtectKernelDefaultsBeEnabled(c, kubernetesVersion))

	if c.StreamingConnectionIdleTimeout == nil {
		if version.ConstraintK8sGreaterEqual126.Check(kubernetesVersion) {
			c.StreamingConnectionIdleTimeout = &metav1.Duration{Duration: time.Minute * 5}
		} else {
			// this is also the kubernetes default
			c.StreamingConnectionIdleTimeout = &metav1.Duration{Duration: time.Hour * 4}
		}
	}
}
```
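
For example (a hypothetical snippet built only on the function above): a Shoot that leaves `cpuManagerPolicy` unset contributes nothing for it to the hash in `KeyV2`, yet the kubelet config file rendered on the node still ends up with the defaulted value.

```go
// Hypothetical snippet in the same package as setConfigDefaults, purely for illustration.
func exampleDefaulting() {
	// A Shoot that does not set any kubelet parameters arrives here as an empty struct ...
	cfg := &components.ConfigurableKubeletConfigParameters{}
	setConfigDefaults(cfg, semver.MustParse("1.28.4"))

	// ... but the rendered kubelet config file ends up with the defaulted values,
	// none of which were part of the hash input in KeyV2.
	fmt.Println(*cfg.CpuManagerPolicy)     // "none"
	fmt.Println(len(cfg.EvictionHard) > 0) // true: eviction-hard defaults were filled in as well
}
```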

These defaults are reflected neither in the Shoot nor in the hash. If these defaults are ever changed, the change would be applied directly on the nodes without draining (and draining is exactly why the rolling update mechanism exists in the first place). This is also important in the context of #10219.

Maybe we could apply these defaults at the Shoot level rather than only in the config file.
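
One way to read that proposal, as a rough sketch (the helper `applyKubeletConfigDefaults` is hypothetical and does not exist in Gardener today): default the kubelet configuration before it is fed into the hash, so that a change to a default value changes the hash and rolls the nodes with proper draining.

```go
// Hypothetical sketch only -- applyKubeletConfigDefaults is an assumed helper.
// The point is merely that the defaulted configuration, not the raw Shoot value,
// would be what KeyV2 hashes.
defaultedKubeletConfig := shoot.Spec.Kubernetes.Kubelet.DeepCopy()
applyKubeletConfigDefaults(defaultedKubeletConfig, kubernetesVersion)

key := KeyV2(kubernetesVersion, credentialsRotation, worker, nodeLocalDNSEnabled, defaultedKubeletConfig)
```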

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Gardener version:
  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • Others:

Labels: area/quality, kind/bug, lifecycle/stale