criu: Add time namespace to container config after checkpoint/restore #4696

avagin · 2025-03-25T20:15:15Z

CRIU always restores processes into a time namespace to prevent backward jumps of monotonic and boottime clocks. This change updates the container config to ensure that runc exec launches new processes within the container's time namespace.

Fixes checkpoint-restore/criu#2610
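
One way to check the behaviour described above is to compare the time namespace links of the container's init process and of a process started with `runc exec`. A minimal sketch, assuming a Linux kernel with time namespace support (5.6 or later) and host PIDs as arguments; the function names are hypothetical and not part of runc or CRIU:

```shell
# Print the time namespace identifier of a PID, e.g. "time:[4026531834]".
timens_of() {
	readlink "/proc/$1/ns/time"
}

# Succeed (exit 0) if both PIDs are in the same time namespace.
# Returns 2 if either link cannot be read (no such PID, or no timens support).
same_timens() {
	ns_a=$(timens_of "$1") || return 2
	ns_b=$(timens_of "$2") || return 2
	[ "$ns_a" = "$ns_b" ]
}
```

With the init PID taken from `runc state` and the host PID of the exec'ed process, `same_timens "$init_pid" "$exec_pid"` would be expected to succeed after this change.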

tests/integration/checkpoint.bats

kolyshkin · 2025-03-26T02:01:04Z

CRIU always restores processes into a time namespace

Maybe it makes sense to add "Since v3.14, CRIU always restores ..."

libcontainer/criu_linux.go

Since v3.14, CRIU always restores processes into a time namespace to prevent backward jumps of monotonic and boottime clocks. This change updates the container configuration to ensure that `runc exec` launches new processes within the container's time namespace.

Fixes opencontainers#2610

Signed-off-by: Andrei Vagin <avagin@gmail.com>

kolyshkin

LGTM

kolyshkin · 2025-03-26T21:19:35Z

@adrianreber @lifubang @rata PTAL

lifubang · 2025-03-27T09:41:40Z

CRIU always restores processes into a time namespace to prevent backward jumps of monotonic and boottime clocks.

Do you know whether this is a long-term solution in CRIU? It seems that CRIU sets a wrong time offset config for the restored process.

diff --git a/tests/integration/checkpoint.bats b/tests/integration/checkpoint.bats
index 3db34061..85368a8a 100644
--- a/tests/integration/checkpoint.bats
+++ b/tests/integration/checkpoint.bats
@@ -474,5 +474,9 @@ function simple_cr() {
                runc exec test_busybox sh -c 'sleep 1000 < /dev/null &> /dev/null & echo $!'
                [ "$status" -eq 0 ]
                execed_pid=$output
+               runc exec test_busybox cat /proc/self/timens_offsets
+               [ "$status" -eq 0 ]
+               grep -E '^monotonic\s+0\s+0$' <<<"$output"
+               grep -E '^boottime\s+0\s+0$' <<<"$output"
        done
 }

Test error msg:

...
   runc exec test_busybox sh -c sleep 1000 < /dev/null &> /dev/null & echo $! (status=0):
   29
   runc exec test_busybox cat /proc/self/timens_offsets (status=0):
   monotonic          -1 955929992
   boottime           -1 955925201
   --- teardown ---

12 tests, 1 failure, 2 skipped

I don't know whether this is the expected result or not.
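
For reference, each line of `/proc/self/timens_offsets` has the form `<clock> <offset-secs> <offset-nanosecs>`. A small sketch for reading such output as a single nanosecond value, assuming the usual normalized timespec convention (the nanosecond field is non-negative and is added to a possibly negative seconds field); the helper names are hypothetical:

```shell
# Convert one "<secs> <nsecs>" offset pair to total nanoseconds.
offset_to_ns() {
	echo $(( $1 * 1000000000 + $2 ))
}

# Print "<clock> <offset in ns>" for each line of a timens_offsets file.
# Defaults to the calling process's own offsets.
print_offsets_ns() {
	while read -r clock secs nsecs; do
		echo "$clock $(offset_to_ns "$secs" "$nsecs")"
	done <"${1:-/proc/self/timens_offsets}"
}
```

Under that assumption, `offset_to_ns -1 955929992` gives `-44070008`, so the monotonic offset in the output above would be roughly 44 ms behind the host clock rather than a full second.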

avagin · 2025-03-27T14:52:47Z

CRIU always restores processes into a time namespace to prevent backward jumps of monotonic and boottime clocks.

Do you know whether this is a long-term solution in CRIU?

It is a long-term solution. The time namespace was designed for the C/R purpose, and there is no way to restore processes without using time namespaces.

It seems that CRIU sets a wrong time offset config for the restored process.

CRIU sets offsets so that the monotonic and boottime clocks continue ticking from the moment when a container was dumped. The monotonic and boottime clocks cannot jump backward. When a container is restored on another machine or on the same machine after a reboot, the "native" clocks can have any values, and we need to adjust them according to their values when the processes were dumped.
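
The arithmetic behind this can be sketched as follows. This is a simplified illustration, not CRIU's actual code: it assumes clock readings are given as plain nanosecond counts, and the function names are hypothetical.

```shell
# Offset to apply in the time namespace:
#   clock value saved at dump time minus the native clock value at restore time.
timens_offset_ns() {
	echo $(( $1 - $2 ))
}

# What a restored process reads: native clock value plus the namespace offset.
restored_clock_ns() {
	echo $(( $1 + $2 ))
}

# Example: dumped at 5 s of monotonic time, restored after a reboot
# when the native monotonic clock reads only 1 s.
dumped=5000000000
native=1000000000
offset=$(timens_offset_ns "$dumped" "$native")
restored_clock_ns "$native" "$offset"
```

With these inputs the restored reading equals the dumped value, so the clock continues from where it stopped instead of jumping backward. When the native clock is ahead of the dumped value (restore on the same machine without a reboot), the same formula yields a negative offset.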

rata

This LGTM but I'd like for @lifubang to nail down the issue he sees.

rata

Thanks for the PR! Left some comments

tests/integration/checkpoint.bats

rata · 2025-03-31T15:27:24Z

tests/integration/checkpoint.bats

+		fi
+
+		# exec a new background process.
+		runc exec test_busybox sh -c 'sleep 1000 < /dev/null &> /dev/null & echo $!'


Why do we have the /dev/null redirections here? I guess there is a CRIU thing that works better with that? Can you please elaborate?

kolyshkin · 2025-03-31

Same as in simple_cr, from where the code is copied. The issue here is that we need the sleep process to be checkpointable, and if some of its file descriptors point to files/pipes on the host, it is not checkpointable (as the process is not contained in a container).

rata · 2025-04-01

Hmm, I don't see these null redirections in checkpoint.bats. Also, runc reopens /dev/null, so I guess that shouldn't be the issue?

But yeah, removing those redirections locally makes the test stall. I guess this is fine if the CRIU devs think it is needed, although some explanation of why that is the case would be great, IMHO.

avagin · 2025-04-01

This sleep process is daemonized, and runc exec exits right after forking it. If we don't redirect its file descriptors, they will be closed when runc exec exits, and the sleep process can be killed by SIGHUP.
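
The effect of the redirections can be sketched outside runc. In this hypothetical helper (not part of the runc test suite), the background command gets all three standard descriptors pointed at /dev/null, so it holds nothing open on the launcher's side and can keep running after the launching shell exits:

```shell
# Start a command in the background with stdin, stdout and stderr
# detached from the caller, and print its PID.
start_detached() {
	"$@" </dev/null >/dev/null 2>&1 &
	echo $!
}

# Usage: the command substitution returns as soon as the PID is printed,
# because the background process does not hold the capture pipe open.
pid=$(start_detached sleep 30)
kill -0 "$pid" && echo "still running: $pid"
kill "$pid"
```

Without the redirections, a caller that captures output through a pipe would typically wait until the background process exits. That may be related to the stall seen when the redirections were removed, but this link is an assumption and is not confirmed in this thread.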

tests/integration/checkpoint.bats

rata

LGTM. Although some more explanations in the tests would be great, as we might need to debug them later.

@lifubang PTAL to see if your concern is addressed

lifubang · 2025-04-01T10:05:14Z

PTAL to see if your concern is addressed

I think it's reasonable for CRIU to set the restored processes' time offset.
But after the container is restored, maybe processes newly exec'ed into this container shouldn't join the init process's time namespace? I think these new processes don't need to know whether this container has been checkpointed and restored at some point. But maybe I'm wrong.

rata · 2025-04-01T10:28:05Z

@lifubang what do you mean exactly? That the container process and the exec'ed process should not use the same time namespace?

lifubang · 2025-04-01T10:38:45Z

@lifubang what do you mean exactly? That the container process and the exec'ed process should not use the same time namespace?

Yes, I think so, because the original config.json may not contain a time namespace when the container is created. So a process exec'ed after restore should not have a different time offset either.

rata · 2025-04-01T11:07:53Z

@lifubang but having a different view in exec than the container seems very confusing. And debugging anything might be very hard (if it's timens related)

lifubang · 2025-04-01T15:32:30Z

@lifubang but having a different view in exec than the container seems very confusing. And debugging anything might be very hard (if it's timens related)

Yes, so I don't feel strongly about changing this. Feel free to merge this PR.

rata · 2025-04-01T17:54:28Z

@lifubang merging then. We can revert or fix if you find any weird issue, as you usually do. I hope not in this case, though :)

kolyshkin · 2025-04-02T00:25:15Z

1.3 backport: #4705

rata · 2025-04-07T12:33:56Z

Looking at the upstream issue, it was reported with 1.2. So we should backport it there too.

kolyshkin · 2025-04-08T00:11:09Z

Looking at the upstream issue, it was reported with 1.2. So we should backport it there too.

I'm not sure if we'll ever do 1.2.7 but why not.

1.2 backport: #4714

avagin force-pushed the criu-vs-exec branch from 21e62e6 to a45ac36 Compare March 25, 2025 20:24

kolyshkin reviewed Mar 26, 2025

tests/integration/checkpoint.bats

kolyshkin reviewed Mar 26, 2025

tests/integration/checkpoint.bats

kolyshkin force-pushed the criu-vs-exec branch from a45ac36 to 4b62df2 Compare March 26, 2025 02:01

kolyshkin reviewed Mar 26, 2025

libcontainer/criu_linux.go

kolyshkin added the area/checkpoint-restore and backport/1.3-todo labels Mar 26, 2025

avagin force-pushed the criu-vs-exec branch from 4b62df2 to 55ac122 Compare March 26, 2025 15:11

avagin force-pushed the criu-vs-exec branch from 55ac122 to b68cbdf Compare March 26, 2025 15:12

kolyshkin approved these changes Mar 26, 2025

kolyshkin added this to the 1.3.0-rc.2 milestone Mar 26, 2025

rata reviewed Mar 31, 2025

rata approved these changes Apr 1, 2025

rata merged commit c3a41d7 into opencontainers:main Apr 1, 2025
34 checks passed

avagin mentioned this pull request Apr 1, 2025

with runc, PID 1 gets put in the wrong time NS on restore checkpoint-restore/criu#2610

Closed

kolyshkin mentioned this pull request Apr 2, 2025

[1.3] criu: Add time namespace to container config after checkpoint/restore #4705

Merged

kolyshkin removed this from the 1.3.0-rc.2 milestone Apr 2, 2025

kolyshkin removed the backport/1.3-todo label Apr 2, 2025

kolyshkin added the backport/1.3-done label Apr 2, 2025

kolyshkin mentioned this pull request Apr 8, 2025

[1.2] criu: Add time namespace to container config after checkpoint/restore #4714

Merged

kolyshkin added the backport/1.2-done label Apr 8, 2025
