Skip to content

Conversation

djdongjin
Copy link
Member

@djdongjin djdongjin commented Nov 28, 2024

Fix #11001

When the shim creates a container within NewContainer, the cgroups v1/v2 may be nil due to errors.

var cg interface{}
if cgroups.Mode() == cgroups.Unified {
g, err := cgroupsv2.PidGroupPath(pid)
if err != nil {
log.G(ctx).WithError(err).Errorf("loading cgroup2 for %d", pid)
return container, nil
}
cg, err = cgroupsv2.Load(g)
if err != nil {
log.G(ctx).WithError(err).Errorf("loading cgroup2 for %d", pid)
}
} else {
cg, err = cgroup1.Load(cgroup1.PidPath(pid))
if err != nil {
log.G(ctx).WithError(err).Errorf("loading cgroup for %d", pid)
}
}
container.cgroup = cg

However, a nil pointer of *cgroupsv2.Manager type is not an nil interface{}. And it can be converted back to *cgroupsv2.Manager type (the resulted var will be nil) which causes nil deference panic in Start in the above issue. (sample: https://go.dev/play/p/ZThwvj1n3g8)

case *cgroupsv2.Manager:
allControllers, err := cg.RootControllers()

The fix is to return directly on error (to avoid container.cgroup = cg ), so the switch in Start will match neither cgroup1.Cgroup (interface) nor *cgroupsv2.Manager.

@djdongjin
Copy link
Member Author

djdongjin commented Nov 28, 2024

Notice the change only fixes/avoid the nil deference panic issue. It seems in current shim implementation, cgroups related errors in are ignored (only logged) and don't cause task Create/Start to fail in shim. Is this expected?

Signed-off-by: Jin Dong <djdongjin95@gmail.com>
@djdongjin
Copy link
Member Author

@AkihiroSuda could you PTAL? I realized the same issue exists in two places (NewContainer and Container. Start). thanks

@AkihiroSuda AkihiroSuda added cherry-pick/1.6.x Change to be cherry picked to release/1.6 branch cherry-pick/1.7.x Change to be cherry picked to release/1.7 branch cherry-pick/2.0.x Change to be cherry picked to release/2.0 branch labels Dec 3, 2024
@AkihiroSuda AkihiroSuda added this pull request to the merge queue Dec 4, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 4, 2024
@djdongjin
Copy link
Member Author

hi @AkihiroSuda could you please re-add this to merge queue? thanks.

Last time it was moved out due to criu install/download failures.
https://github.com/containerd/containerd/actions/runs/12152804593/job/33889965206
https://github.com/containerd/containerd/actions/runs/12152804593/job/33889966087

@AkihiroSuda AkihiroSuda added this pull request to the merge queue Dec 5, 2024
Merged via the queue into containerd:main with commit 5855c4d Dec 5, 2024
58 checks passed
@djdongjin
Copy link
Member Author

/cherrypick release/2.0

@k8s-infra-cherrypick-robot

@djdongjin: new pull request created: #11098

In response to this:

/cherrypick release/2.0

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@djdongjin
Copy link
Member Author

/cherrypick release/1.7

@djdongjin
Copy link
Member Author

/cherrypick release/1.6

@k8s-infra-cherrypick-robot

@djdongjin: #11069 failed to apply on top of branch "release/1.7":

Applying: fix panic due to nil dereference cgroups v2
Using index info to reconstruct a base tree...
A	cmd/containerd-shim-runc-v2/runc/container.go
Falling back to patching base and 3-way merge...
Auto-merging runtime/v2/runc/container.go
CONFLICT (content): Merge conflict in runtime/v2/runc/container.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Patch failed at 0001 fix panic due to nil dereference cgroups v2

In response to this:

/cherrypick release/1.7

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-infra-cherrypick-robot

@djdongjin: #11069 failed to apply on top of branch "release/1.6":

Applying: fix panic due to nil dereference cgroups v2
Using index info to reconstruct a base tree...
A	cmd/containerd-shim-runc-v2/runc/container.go
Falling back to patching base and 3-way merge...
Auto-merging runtime/v2/runc/container.go
CONFLICT (content): Merge conflict in runtime/v2/runc/container.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Patch failed at 0001 fix panic due to nil dereference cgroups v2

In response to this:

/cherrypick release/1.6

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@dmcgowan dmcgowan added cherry-picked/1.6.x PR commits are cherry-picked into release/1.6 branch cherry-picked/1.7.x PR commits are cherry-picked into release/1.7 branch cherry-picked/2.0.x PR commits are cherry picked into the release/2.0 branch and removed cherry-pick/1.6.x Change to be cherry picked to release/1.6 branch cherry-pick/1.7.x Change to be cherry picked to release/1.7 branch cherry-pick/2.0.x Change to be cherry picked to release/2.0 branch labels Dec 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/runtime Runtime cherry-picked/1.6.x PR commits are cherry-picked into release/1.6 branch cherry-picked/1.7.x PR commits are cherry-picked into release/1.7 branch cherry-picked/2.0.x PR commits are cherry picked into the release/2.0 branch kind/bug size/M
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cgroups: invalid group path leads to a segmentation violation
6 participants