Description of problem
While working to enable GID policy checks in PR #11077, I found that the group ID chosen for the running user in the same container image, quay.io/opstree/redis, differs between a standard (ubuntu/mariner/etc, clh) CI cluster and the (qemu-coco-dev, nydus) CI clusters.
This looks like a problem with nydus, since one would expect the chosen groups to be consistent for an image regardless of the cluster it is deployed in.
I can confirm this locally with two clusters, each deployed with the k8s-policy-deployment.yaml:
AZ_RG="${my_resource_group}" \
DOCKER_REGISTRY=ghcr.io \
DOCKER_REPO=kata-containers/kata-deploy-ci \
DOCKER_TAG="${DOCKER_TAG}" \
GH_PR_NUMBER="${GH_PR_NUMBER}" \
KATA_HYPERVISOR=clh \
KBS_INGRESS=aks \
KUBERNETES=vanilla \
USING_NFD=false \
bash gha-run.sh create-cluster
In the first cluster, one can exec into the deployed redis container and see that id reports 1000:1000.
AZ_RG="${my_resource_group}" \
DOCKER_REGISTRY=ghcr.io \
DOCKER_REPO=kata-containers/kata-deploy-ci \
DOCKER_TAG="${DOCKER_TAG}" \
GH_PR_NUMBER="${GH_PR_NUMBER}" \
KBS_INGRESS=aks \
KUBERNETES=vanilla \
USING_NFD=false \
KATA_HYPERVISOR=qemu-coco-dev \
PULL_TYPE=guest-pull \
KBS=false \
SNAPSHOTTER=nydus \
bash gha-run.sh create-cluster
With this deployment, id in the same redis container reports 1000:0.
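To compare the two clusters in a script, the `uid:gid` pair can be extracted from the usual `id` output. This is only a sketch: the helper name `uid_gid_from_id` is hypothetical, and it assumes `id` prints the common `uid=N(name) gid=N(name) ...` format.

```shell
# Hypothetical helper: reduce `id` output to "uid:gid".
# Assumes the common "uid=N(name) gid=N(name) groups=..." format.
uid_gid_from_id() {
    printf '%s\n' "$1" | sed -E 's/^uid=([0-9]+).* gid=([0-9]+).*/\1:\2/'
}

# Example with a captured string; in a cluster the argument would come from
# something like `kubectl exec <pod> -- id` (pod name left as a placeholder).
uid_gid_from_id "uid=1000(redis) gid=0(root) groups=0(root)"
```

Running the helper against each cluster's output gives two short strings that can be compared directly.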
Expected result
I would expect the group id for this redis container in all cases to be 1000:1000. This is because the /etc/passwd file for that container is as follows:
root:x:0:0:root:/root:/bin/ash
bin:x:1:1:bin:/bin:/sbin/nologin
...
**redis:x:1000:1000:Linux User,,,:/home/redis:/sbin/nologin**
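The expected `uid:gid` pair can be read straight from a passwd-format file, where field 3 is the UID and field 4 is the primary GID. The following is a minimal sketch; the function name `passwd_uid_gid` is hypothetical and is not part of kata-containers or nydus.

```shell
# Hypothetical helper: print "uid:gid" for a user from a passwd-format file.
# Usage: passwd_uid_gid <user> [passwd-file]
# Returns non-zero if the user has no entry in the file.
passwd_uid_gid() {
    user="$1"
    passwd_file="${2:-/etc/passwd}"
    # Fields are colon-separated: name:password:UID:GID:gecos:home:shell
    awk -F: -v u="$user" '
        $1 == u { print $3 ":" $4; found = 1; exit }
        END { if (!found) exit 1 }
    ' "$passwd_file"
}
```

Run inside the container as `passwd_uid_gid redis`, this should print 1000:1000 for the entry shown above, which is the value the policy check expects `id` to report.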
Further information
I tried to create another clh cluster with nydus enabled, but that configuration doesn't seem to work out of the box. I did manage to deploy the clh cluster with PULL_TYPE=guest-pull, and the erroneous GID behavior did not manifest there.
Since the primary GID for a user is looked up in the container's /etc/passwd, I suspect that Nydus' lazy image layer pulling isn't making /etc/passwd available from the layers at the time of that lookup, which would leave the GID at 0.