
kubelet fails to start pods after upgrade to 1.31.X #127316

@JumpMaster

Description


What happened?

After upgrading the first controller from 1.30.4 to 1.31.0, and later to 1.31.1, the kubelet fails to start the majority of pods running on that controller.

What did you expect to happen?

All pods should start as before the upgrade.

$ kubectl get pod -o wide --all-namespaces -w | grep controller2
calico-system     calico-node-p4fzm                                          0/1     CreateContainerConfigError   0              153m    192.168.16.112   kubecontroller2   <none>           <none>
calico-system     calico-typha-685957f77b-k2xf8                              0/1     CreateContainerConfigError   4 (122m ago)   40d     192.168.16.112   kubecontroller2   <none>           <none>
calico-system     csi-node-driver-b6jkc                                      0/2     CreateContainerConfigError   8 (122m ago)   40d     10.245.98.170    kubecontroller2   <none>           <none>
default           netdata-child-pctf4                                        0/2     CreateContainerConfigError   2 (122m ago)   11d     192.168.16.112   kubecontroller2   <none>           <none>
kube-system       etcd-kubecontroller2                                       1/1     Running                      1 (122m ago)   122m    192.168.16.112   kubecontroller2   <none>           <none>
kube-system       kube-apiserver-kubecontroller2                             1/1     Running                      1 (122m ago)   122m    192.168.16.112   kubecontroller2   <none>           <none>
kube-system       kube-controller-manager-kubecontroller2                    1/1     Running                      1 (122m ago)   122m    192.168.16.112   kubecontroller2   <none>           <none>
kube-system       kube-proxy-q2drw                                           0/1     CreateContainerConfigError   0              17h     192.168.16.112   kubecontroller2   <none>           <none>
kube-system       kube-scheduler-kubecontroller2                             1/1     Running                      1 (122m ago)   122m    192.168.16.112   kubecontroller2   <none>           <none>
kube-system       kube-vip-kubecontroller2                                   1/1     Running                      1 (122m ago)   122m    192.168.16.112   kubecontroller2   <none>           <none>
monitoring        kube-prometheus-stack-prometheus-node-exporter-gnwsj       0/1     CreateContainerConfigError   1 (122m ago)   5d1h    192.168.16.112   kubecontroller2   <none>           <none>
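Not part of the original report, but the underlying cause of a CreateContainerConfigError is usually visible in the pod events and the kubelet journal. A sketch of typical diagnostic commands, using one of the affected pod names from the listing above:

```shell
# Pod events usually state why the container config could not be created
kubectl -n kube-system describe pod kube-proxy-q2drw | grep -A 6 'Events:'

# The kubelet journal on the affected node often shows the underlying error
journalctl -u kubelet --since "2 hours ago" | grep -iE 'error|failed' | tail -n 20
```

Attaching this output would likely help triage, since CreateContainerConfigError covers several distinct failure modes.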

How can we reproduce it (as minimally and precisely as possible)?

A basic Debian 12 VM is used as the controller node. The first controller was upgraded from 1.30.4 to 1.31.0 and later to 1.31.1.
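A sketch of the standard kubeadm upgrade sequence on Debian that would reproduce this, assuming the pkgs.k8s.io apt repository (the exact package version suffix, shown here as 1.31.1-1.1, should be checked with apt-cache madison):

```shell
# Upgrade kubeadm first, then apply the control-plane upgrade
sudo apt-mark unhold kubeadm
sudo apt-get update && sudo apt-get install -y kubeadm=1.31.1-1.1
sudo apt-mark hold kubeadm
sudo kubeadm upgrade apply v1.31.1

# Then upgrade kubelet/kubectl and restart the kubelet
sudo apt-mark unhold kubelet kubectl
sudo apt-get install -y kubelet=1.31.1-1.1 kubectl=1.31.1-1.1
sudo apt-mark hold kubelet kubectl
sudo systemctl daemon-reload && sudo systemctl restart kubelet
```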

Anything else we need to know?

Reinstalling kubelet 1.30.4 allows pods to start correctly.
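For completeness, a sketch of the downgrade step referenced above, again assuming the pkgs.k8s.io apt packaging (the 1.30.4-1.1 suffix is an assumption; verify with apt-cache madison kubelet):

```shell
# Pin kubelet back to the last known-good version
sudo apt-mark unhold kubelet
sudo apt-get install -y --allow-downgrades kubelet=1.30.4-1.1
sudo apt-mark hold kubelet
sudo systemctl daemon-reload && sudo systemctl restart kubelet
```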

Kubernetes version

$ kubectl version
Client Version: v1.30.4
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.30.4

Cloud provider

Self hosted / metal

OS version

# On Linux:
$ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

$ uname -a
Linux kubecontroller2 6.1.0-25-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.106-3 (2024-08-26) x86_64 GNU/Linux

Install tools

kubeadm

Container runtime (CRI) and version (if applicable)

containerd containerd.io 1.7.22 7f7fdf5fed64eb6a7caf99b3e12efcf9d60e311c

Related plugins (CNI, CSI, ...) and versions (if applicable)

calico v3.28.1

Metadata

Assignees

No one assigned

Labels

kind/bug, kind/support, needs-triage, sig/node, triage/needs-information

Status

Needs Information