-
Notifications
You must be signed in to change notification settings - Fork 3.7k
Fix the panic caused by the failure of RunPodSandbox #11588
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hi @HirazawaUi. Thanks for your PR. I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Please sign-off your commit. |
Signed-off-by: BING HONGTAO <695097494plus@gmail.com>
Signed. |
/ok-to-test |
why not add check in handleSandboxTaskExit? why sb.Container = nil ? @HirazawaUi |
ref: #11588 (comment) |
I find will call handleSandboxTaskExit twice, I don't know why. It should be anoter bug. |
In the logs I’ve observed, every panic seems to stem from a call to HandleEvent -> handleSandboxTaskExit, where a panic occurs. Additionally, in all other callers of If you think there’s anything incorrect in my response, I’d greatly appreciate it if you could point it out. That said, I’m a bit puzzled as to why you opened another PR to address this issue, especially when there doesn’t seem to be a clear difference between the two PRs. |
I close my pr,it should be a nother bug |
Will this cover up other issues? In fact, this paragraph was moved from https://github.com/containerd/containerd/blob/v1.7.27/pkg/cri/server/sandbox_run.go#L196C1-L205C5
But after the move, If we make changes here, may similar problems actually occur in other unknown places? |
@estesp It seems some test job failed due to some unrelated reasons, preventing this PR from being successfully merged. Is there anything I can do at this point to help? Or should I wait for other maintainers to re-add it to merge queue? |
Why hasn't this been backported to the 2.0 branch yet? |
Thank you for the reminder, I’ll cherry-pick this to the 2.0 branch. |
Thanks. How about getting #11576 / b9ab7a3 merged and backported too? We've had this pulled into our fork for months, containerd is basically unusable without it for a large subset of our users. It looks like 2.0 hasn't seen a patch release tagged since 2.1 came out, is this intentional? @austinvazquez @estesp |
Definitely not intentional. We attempt to properly mark PRs for cherry-pick that are bug fixes, but we are definitely imperfect. I think having two 2.x lines of support has made things a bit more challenging lately, but hopefully we can make sure to get 2.0.x in better shape and cut a release soon. We tend to be driven by customer demand on minor/point releases, so if there is ever a time you feel like a release is warranted please open an issue or ping us on #containerd-dev Slack channel. |
Fix the panic caused by the failure of RunPodSandbox (cherry picked from commit 009d8be) Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Adds backports of containerd/containerd#11588 and containerd/containerd#12053 Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Adds backports of containerd/containerd#11588 and containerd/containerd#12053 Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Adds backports of containerd/containerd#11588 and containerd/containerd#12053 Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Adds backports of containerd/containerd#11588 and containerd/containerd#12053 Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Adds backports of containerd/containerd#11588 and containerd/containerd#12053 Signed-off-by: Brad Davidson <brad.davidson@rancher.com> (cherry picked from commit 0d48c8e) Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Adds backports of containerd/containerd#11588 and containerd/containerd#12053 Signed-off-by: Brad Davidson <brad.davidson@rancher.com> (cherry picked from commit 0d48c8e) Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Adds backports of containerd/containerd#11588 and containerd/containerd#12053 Signed-off-by: Brad Davidson <brad.davidson@rancher.com> (cherry picked from commit 0d48c8e) Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Adds backports of containerd/containerd#11588 and containerd/containerd#12053 Signed-off-by: Brad Davidson <brad.davidson@rancher.com> (cherry picked from commit 0d48c8e) Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Adds backports of containerd/containerd#11588 and containerd/containerd#12053 Signed-off-by: Brad Davidson <brad.davidson@rancher.com> (cherry picked from commit 0d48c8e) Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Adds backports of containerd/containerd#11588 and containerd/containerd#12053 Signed-off-by: Brad Davidson <brad.davidson@rancher.com> (cherry picked from commit 0d48c8e) Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
When the RunPodSandbox interface fails due to an upstream context cancellation or other reasons, HandleEvent encounters a nil pointer because the PodSandbox.Container field is nil, resulting in a panic.
I discovered this issue in the kubernetes/kubernetes repository.
PR kubernetes/kubernetes#127122 aims to address this by passing the context to the container runtime, resolving the issue where container creation and image pulling actions cannot be canceled.
Previously, the tests in this PR worked fine, but after the CI in the k/k repository switched to containerd 2.0, many tests in this PR started failing.
While investigating the error, I captured the following error in the containerd logs: