[release/0.3] Do not abort image-pull in Extracting phase #373
This addresses the issue with too big of a hammer. Pulling an image has two phases: downloading and extracting. The existing code implements a watchdog timer that is reset whenever the image pull makes progress. That's a fantastic feature to have: it aborts the operation if it is not making progress. We should keep that!
The root cause is that the extracting phase of image pull is not considered to be making progress by the watchdog timer, because the daemon does not provide progress updates during the extraction phase. Could we simply disable the watchdog timer during the extraction phase, while leaving it active for the download phase? As far as I can tell, the extraction phase can be identified with `progress.message.Status == "Extracting"`.
Please merge to the development branch (master) first, then open backports to release branches. It's too easy to forget to update the development branch if done the other way around, leading to regressions upon upgrade.
Signed-off-by: Xinfeng Liu <XinfengLiu@icloud.com>
The `E2E Tests (test-executor, v1.28.13+k3s1, minimal, false)` test has been flaky for a while and keeps failing with the error `ErrImageNeverPull: Container image "quay.io/argoproj/argocli:latest" is not present with pull policy of Never`. This shouldn't be happening, because k3s should be using cri-dockerd as the container runtime and the "Load images" step handles loading that image into Docker. There were changes to cri-dockerd recently (Mirantis/cri-dockerd#373) that might be related, but it's impossible to tell without the logs.
Signed-off-by: Mason Malone <mmalone@adobe.com>
Do not abort image-pull when the progress deadline (1 minute) is reached during the `Extracting` phase. Some large images need more than 1 minute to extract.
Fixes #372
Proposed Changes
Do not abort image-pull when the progress deadline (1 minute) is reached. Log a warning instead.
Testing
Follow the testing steps in #372 using a build of this PR. The image pull now succeeds.