Skip to content

Conversation

Apokleos
Copy link
Contributor

Fixes #10692

@katacontainersbot katacontainersbot added the size/medium Average sized task label Dec 25, 2024
@Apokleos Apokleos self-assigned this Dec 25, 2024
@Apokleos Apokleos marked this pull request as ready for review December 30, 2024 12:10
Comment on lines 583 to 709
# Agent request timeout value in millisecond

# This timeout value is used to set the maximum duration for the agent to process a CreateContainerRequest.
# It's also used to ensure that workloads, especially those involving large image pulls within the guest,
# have sufficient time to complete.

# Effective Timeout Determination:
# The effective timeout for a CreateContainerRequest is determined by taking the minimum of the following two values:
# - request_timeout_ms: The timeout value configured for the agent (default: 30,000 milliseconds).
# - runtime-request-timeout: The timeout value specified in the Kubelet configuration described as the link below:
# (https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#:~:text=runtime%2Drequest%2Dtimeout)

# default timeout: 30_000 ms

# request_timeout_ms = 30000

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm interested in how you view this approach vs the go runtime's use of create_container_timeout we have in the config:

# Indicates the CreateContainer request timeout needed for the workload(s)
# It using guest_pull this includes the time to pull the image inside the guest
# Defaults to @DEFCREATECONTAINERTIMEOUT@ second(s)
# Note: The effective timeout is determined by the lesser of two values: runtime-request-timeout from kubelet config
# (https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#:~:text=runtime%2Drequest%2Dtimeout) and create_container_timeout.
# In essence, the timeout used for guest pull=runtime-request-timeout<create_container_timeout?runtime-request-timeout:create_container_timeout.
create_container_timeout = @DEFCREATECONTAINERTIMEOUT@

and docs:
| `io.katacontainers.config.runtime.create_container_timeout` | `uint64` | the timeout for create a container in `seconds`, default is `60` |

I'm not sure if there is an official policy, but it looks like the rest of the config is pretty aligned between the go and rust runtimes, so we might want to keep that here too for ease of user experience and doc consistency?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx Steve @stevenhorsman good cache ! In fact, I'm not sure if I should make it align with the config in runtime-go. Or I prefer different voices about it.

IMO:
(1) using request_timeout_ms: But I know the config request_timeout_ms is already existing in runtime-rs from kata 3.0 release. Just based on this, I use the request_timeout_ms for setting timeout in the case of large image pulling in guest. Some difference of configurations of runtime-rs from runtime-go makes sense for me.
(2) using create_container_timeout instead of request_timeout_ms: Making it align with the config of runtime-go which reduces doc inconsistency and gives us ease of user experience, also makes sense for me.

If I have to make it change, based on case (2), we'll need to map create_container_timeout to request_timeout_ms. Users will input create_container_timeout via configuration and annotations, but internally, the runtime-rs expects request_timeout_ms which create_container_timeout will be converted to create_container_timeout internally. What do you think about this approach ? or any other ideas ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the extra information. I don't think there is a best solution and either have good and bad points. If we are going to accept there are differences in behaviour between the go and rust runtimes, then I think this might be a good opportunity to work out the best design for the long-term and adopt that behaviour. e.g. I'm trying to work out if the isolation of having a specific create_container_timeout request is helpful as you might want to give a 300s timeout to pull a large pull, but probably don't want to wait that long for an exec process request (as an example), so is there a benefit to having create_container timeout that is independent from the general request_timeout_ms?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stevenhorsman By default in k8s, the timeout for creating container request is set to 120s. The reason the default value for create_container_timeout is 60s is that defaultRequestTimeout is configured to 60 seconds in go runtime. I recommend setting the default value for create_container_timeout or request_timeout_ms to not exceed 120s.

@Apokleos Apokleos force-pushed the guest-pullimage-timeout branch from 8fddad8 to 47e98ea Compare June 16, 2025 02:04
@Apokleos Apokleos force-pushed the guest-pullimage-timeout branch 2 times, most recently from 406596b to e2825a9 Compare June 23, 2025 08:07
@Apokleos Apokleos requested a review from stevenhorsman June 23, 2025 08:08
@Apokleos Apokleos force-pushed the guest-pullimage-timeout branch from e2825a9 to c75c668 Compare June 23, 2025 08:15
Apokleos added 4 commits June 30, 2025 20:04
To better understand the impact of different timeout values on system
behavior, this section provides a more comprehensive explanation of the
request_timeout_ms:

This timeout value is used to set the maximum duration for the agent to
process a CreateContainerRequest. It's also used to ensure that workloads,
especially those involving large image pulls within the guest, have sufficient
time to complete.
Based on explaination above, it's renamed with `create_container_timeout`,
Specially, exposed in 'configuration.toml'

Fixes kata-containers#10692

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
It allows users to set this create container timeout within
configuration.toml according to the size of image to be pulled
inside guest.

Fixes kata-containers#10692

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
It's used to indicate timeout value set for image pulling in
guest during creating container.
This allows users to set this timeout with annotation according to the
size of image to be pulled.

Fixes kata-containers#10692

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
Add the definiation of variable DEFCREATECONTAINERTIMEOUT into
Makefile target with default timeout 30s.

Fixes: kata-containers#485

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
@Apokleos Apokleos force-pushed the guest-pullimage-timeout branch from 9c27d60 to 9839c17 Compare June 30, 2025 12:05
@Apokleos Apokleos merged commit 8784ceb into kata-containers:main Jul 1, 2025
508 of 533 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ok-to-test size/medium Average sized task
Projects
None yet
Development

Successfully merging this pull request may close these issues.

runtime-rs: support setting create_container timeout with request_timeout_ms for image pulling in guest
5 participants