Description
When using a remote snapshotter (or any other snapshotter that doesn't place snapshots under the containerd root directory), ephemeral storage limits are not enforced by the kubelet. The container can blow past its limits and keep running indefinitely.
The kubelet logs show errors like:
kubelet[3094]: E0419 15:57:23.046299 3094 cri_stats_provider.go:448] "Failed to get the info of the filesystem with mountpoint" err="failed to get device for dir \"/var/lib/containerd/io.containerd.snapshotter.v1.soci\": stat failed on /var/lib/containerd/io.containerd.snapshotter.v1.soci with error: no such file or directory" mountpoint="/var/lib/containerd/io.containerd.snapshotter.v1.soci"
and
kubelet[3094]: E0419 15:56:55.022396 3094 kubelet.go:1436] "Image garbage collection failed multiple times in a row" err="invalid capacity 0 on image filesystem"
It looks like the kubelet is unable to run ephemeral storage checks and image garbage collection because it looks for image filesystem information under the containerd root directory, where this snapshotter does not store its snapshots.
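To check whether a node is affected, the following is a minimal Go sketch that stats the same directory the kubelet complains about. The path pattern (`<root>/io.containerd.snapshotter.v1.<name>`) is inferred from the log line above, not taken from the kubelet source, and the function names `snapshotterDir` and `checkSnapshotterDir` are hypothetical helpers for illustration only.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// snapshotterDir builds the directory that appears in the kubelet error.
// Assumption: the naming pattern is inferred from the log line in this issue.
func snapshotterDir(root, snapshotter string) string {
	return filepath.Join(root, "io.containerd.snapshotter.v1."+snapshotter)
}

// checkSnapshotterDir reports whether that directory exists.
// A missing directory is returned as (false, nil) rather than as an error.
func checkSnapshotterDir(root, snapshotter string) (bool, error) {
	info, err := os.Stat(snapshotterDir(root, snapshotter))
	if errors.Is(err, fs.ErrNotExist) {
		return false, nil
	}
	if err != nil {
		return false, err
	}
	return info.IsDir(), nil
}

func main() {
	// Values taken from the config.toml attached to this issue.
	root, name := "/var/lib/containerd", "soci"

	ok, err := checkSnapshotterDir(root, name)
	if err != nil {
		fmt.Fprintln(os.Stderr, "stat error:", err)
		os.Exit(1)
	}
	fmt.Printf("%s exists: %v\n", snapshotterDir(root, name), ok)
}
```

If this reports `exists: false` on a node running the snapshotter, the node is likely in the state described here: the stat fails in the same way the kubelet's does.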
Steps to reproduce the issue
- Configure containerd to use a remote snapshotter in a k8s environment
- Create a pod with an ephemeral storage limit:
resources:
  limits:
    ephemeral-storage: 20M
  requests:
    ephemeral-storage: 10M
- Exec into the container and allocate more disk space than allowed
# fallocate -l 1G test1
- Observe that the pod does not get evicted and the kubelet logs show errors above
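For a sense of scale, the arithmetic behind the reproduction can be sketched in Go. This assumes the usual unit conventions: Kubernetes reads `20M` as 20 × 10^6 bytes, and `fallocate` reads `1G` as 1 GiB (2^30 bytes). The `overLimit` helper is hypothetical and is not how the kubelet performs its check.

```go
package main

import "fmt"

const (
	// Kubernetes quantity "20M" uses the decimal suffix: 20 * 10^6 bytes.
	limitBytes int64 = 20 * 1000 * 1000
	// fallocate "-l 1G" uses the binary suffix: 1 GiB = 2^30 bytes.
	allocatedBytes int64 = 1 << 30
)

// overLimit is a hypothetical helper: true when usage exceeds the limit.
func overLimit(used, limit int64) bool {
	return used > limit
}

func main() {
	ratio := float64(allocatedBytes) / float64(limitBytes)
	fmt.Printf("allocated=%d limit=%d over=%v ratio=%.1fx\n",
		allocatedBytes, limitBytes, overLimit(allocatedBytes, limitBytes), ratio)
}
```

The allocation exceeds the limit by more than 50 times, so this is not a borderline case that depends on measurement granularity.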
Describe the results you received and expected
Expected: the pod is evicted and the kubelet logs show no errors. Received: the pod keeps running and the kubelet logs show the errors above.
What version of containerd are you using?
containerd github.com/containerd/containerd 1.7.11 64b8a81
Any other relevant information
Related downstream issue awslabs/soci-snapshotter#1093
Show configuration if it is related to CRI plugin.
$ cat /etc/containerd/config.toml
version = 2
root = "/var/lib/containerd"
state = "/run/containerd"
[grpc]
address = "/run/containerd/containerd.sock"
[proxy_plugins.soci]
type = "snapshot"
address = "/run/soci-snapshotter-grpc/soci-snapshotter-grpc.sock"
[plugins."io.containerd.grpc.v1.cri".containerd]
default_runtime_name = "runc"
discard_unpacked_layers = true
snapshotter = "soci"
# This line is required for containerd to send information about how to lazily load the image to the snapshotter
disable_snapshot_annotations = false
[plugins."io.containerd.grpc.v1.cri".registry]
config_path = "/etc/containerd/certs.d:/etc/docker/certs.d"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
[plugins."io.containerd.grpc.v1.cri".cni]
bin_dir = "/opt/cni/bin"
conf_dir = "/etc/cni/net.d"