-
Notifications
You must be signed in to change notification settings - Fork 2.2k
criu: Add time namespace to container config after checkpoint/restore #4696
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Maybe makes sense to add "Since v3.14, CRIU always restores ..." |
Since v3.14, CRIU always restores processes into a time namespace to prevent backward jumps of monotonic and boottime clocks. This change updates the container configuration to ensure that `runc exec` launches new processes within the container's time namespace. Fixes opencontainers#2610 Signed-off-by: Andrei Vagin <avagin@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@adrianreber @lifubang @rata PTAL |
Do you know whether this is a long term solution or not in CRIU. It seems that criu set a wrong time offset config for the restored process.
Test error msg:
I don't know whether this is the expected result or not. |
It is a long-term solution. The time namespace was designed for the C/R purpose, and there is no way to restore processes without using time namespaces.
CRIU sets offsets so that the monotonic and boottime clocks continue ticking from the moment when a container was dumped. The monotonic and boottime clocks cannot jump backward. When a container is restored on another machine or on the same machine after a reboot, the "native" clocks can have any values, and we need to adjust them according to their values when the processes were dumped. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM but I'd like for @lifubang to nail down the issue he sees.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! Left some comments
fi | ||
|
||
# exec a new background process. | ||
runc exec test_busybox sh -c 'sleep 1000 < /dev/null &> /dev/null & echo $!' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we have the dev/null redirections here? I guess there is a criu thing that is better with that? Can you please elaborate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as in simple_cr
from where the code is copied. The issue here is we need sleep
process to be checkpointable, and if some of its file descriptors point to files/pipes on the host, it is not checkpointable (as the process is not contained in a container).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I don't see this null redirections in checkpoints.bat. Also, runc does a reopen of /dev/null
, so I guess that shouldn't be the issue?
But yeah, removing those redirections locally makes the test get stalled. I guess this is fine if criu devs think it is needed. Although some explanation on why that is the case would be great, IMHO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sleep process is daemonized, and runsc exec
exits right after forking it. If we don't redirect its file descriptors, they will be closed when runsc exec exits, and the sleep process can be killed by SIGHUP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Although some more explanations in the tests would be great, as we might need to debug them later.
@lifubang PTAL to see if your concern is addressed
I think it's reasonable for CRIU to set the restored processes's time offset. |
@lifubang what do you mean exactly? That the container process and the exec'ed process should not use the same time namespace? |
Yes, I think because the original |
@lifubang but having a different view in exec than the container seems very confusing. And debugging anything might be very hard (if it's timens related) |
Yes, so I have no strong opinion to change. Feel free to merge this PR. |
@lifubang merging then. We can revert or fix if you find any weird issue, as you usually do. I hope not in this case, though :) |
1.3 backport: #4705 |
Looking at the upstream issue, it was reported with 1.2. So we should backport it there too. |
I'm not sure if we'll ever do 1.2.7 but why not. 1.2 backport: #4714 |
CRIU always restores processes into a time namespace to prevent backward jumps of monotonic and boottime clocks. This change updates the container config to ensure that
runc exec
launches new processes within the container's time namespace.Fixes checkpoint-restore/criu#2610