runtime-rs: Add GPU annotations for remote hypervisor #11474

Apokleos · 2025-06-26T09:16:35Z

We're introducing `default_gpus` and `default_gpu_model` as GPU annotations for Kata VM configurations to improve instance selection on remote hypervisors:

- `default_gpus`: Specifies the minimum number of GPUs a VM requires. The remote hypervisor then selects an instance with at least that many GPUs, which prevents under-provisioning.
- `default_gpu_model`: Specifies the GPU model the VM needs. This matters for workloads that depend on a particular GPU architecture or feature set.

Together, these fields give the remote hypervisor the information it needs to select an appropriate instance for a GPU VM.
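
The selection rule described above can be sketched as follows. This is only an illustration, not code from this PR: `InstanceType`, `GpuRequest`, and `select_instance` are hypothetical names, the model strings are made up, and treating an empty `default_gpu_model` as "any model" is an assumption. How a real remote hypervisor backend picks an instance is up to that backend.

```rust
/// Hypothetical description of an instance type a remote backend could launch.
#[derive(Debug, Clone, PartialEq)]
pub struct InstanceType {
    pub name: String,
    pub gpus: u32,
    pub gpu_model: String,
}

/// GPU requirements carried by the two annotations (field types are assumed).
#[derive(Debug, Clone, Default)]
pub struct GpuRequest {
    /// Minimum number of GPUs the VM requires.
    pub default_gpus: u32,
    /// Required GPU model; an empty string means "any model" (assumption).
    pub default_gpu_model: String,
}

/// Pick the smallest candidate that has at least `default_gpus` GPUs and,
/// if a model was requested, matches it (case-insensitively).
pub fn select_instance<'a>(
    candidates: &'a [InstanceType],
    req: &GpuRequest,
) -> Option<&'a InstanceType> {
    candidates
        .iter()
        .filter(|c| c.gpus >= req.default_gpus)
        .filter(|c| {
            req.default_gpu_model.is_empty()
                || c.gpu_model.eq_ignore_ascii_case(&req.default_gpu_model)
        })
        // Prefer the instance with the fewest GPUs to avoid over-provisioning.
        .min_by_key(|c| c.gpus)
}
```

Choosing the smallest matching instance is one possible policy; the annotations themselves only state a lower bound on the GPU count and a required model.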

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>

To help the remote hypervisor select the most appropriate instance for a GPU workload, and so improve resource allocation, two fields, `default_gpus` and `default_gpu_model`, are introduced in `RemoteInfo`. Fixes kata-containers#10484 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>

Two GPU annotations, `default_gpus` and `default_gpu_model`, are introduced for Kata VM configurations to improve instance selection on remote hypervisors: (1) `default_gpus` lets users specify the minimum number of GPUs a VM requires, so the remote hypervisor selects an instance with at least that many GPUs and avoids under-provisioning. (2) `default_gpu_model` lets users specify the GPU model the VM needs, which matters for workloads that depend on particular GPU architectures or features. Fixes kata-containers#10484 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>

Add GPU-specific annotations used by the remote hypervisor for instance selection during `prepare_vm`. Fixes kata-containers#10484 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
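
As a rough sketch of what this step amounts to: the GPU fields from `RemoteInfo` are turned into annotation key/value pairs that accompany the VM request. The field types, the helper name `gpu_annotations`, and the full annotation keys (assumed here to use Kata's `io.katacontainers.config.hypervisor.` prefix) are assumptions for illustration. The actual `prepare_vm` code in this PR may differ, for example by skipping unset values.

```rust
use std::collections::HashMap;

// Assumed annotation keys, following Kata's hypervisor annotation prefix.
pub const ANNO_DEFAULT_GPUS: &str = "io.katacontainers.config.hypervisor.default_gpus";
pub const ANNO_DEFAULT_GPU_MODEL: &str =
    "io.katacontainers.config.hypervisor.default_gpu_model";

/// Simplified stand-in for the `RemoteInfo` configuration; only the two
/// GPU fields are shown, and their types are assumed.
#[derive(Debug, Clone, Default)]
pub struct RemoteInfo {
    pub default_gpus: u32,
    pub default_gpu_model: String,
}

/// Hypothetical helper: build the GPU annotations that would be sent to the
/// remote hypervisor when the VM is prepared.
pub fn gpu_annotations(info: &RemoteInfo) -> HashMap<String, String> {
    let mut annotations = HashMap::new();
    annotations.insert(ANNO_DEFAULT_GPUS.to_string(), info.default_gpus.to_string());
    annotations.insert(
        ANNO_DEFAULT_GPU_MODEL.to_string(),
        info.default_gpu_model.clone(),
    );
    annotations
}
```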

Enable GPU annotations by adding `default_gpus` and `default_gpu_model` to `enable_annotations`, the list of valid annotations. Fixes kata-containers#10484 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
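
To illustrate why the two names have to appear in `enable_annotations`: an annotation that is not listed there is not accepted as a per-pod override. The check below is a simplified sketch with a hypothetical function name. It uses exact matching on the short name after an assumed `io.katacontainers.config.hypervisor.` prefix, which may not match how the runtime actually evaluates the list.

```rust
/// Assumed prefix shared by Kata hypervisor annotations.
const HYPERVISOR_PREFIX: &str = "io.katacontainers.config.hypervisor.";

/// Hypothetical check: is the annotation `key` allowed by the configured
/// `enable_annotations` list? Entries are compared by exact match against
/// the short name that follows the hypervisor prefix.
pub fn is_annotation_enabled(enable_annotations: &[String], key: &str) -> bool {
    match key.strip_prefix(HYPERVISOR_PREFIX) {
        Some(short) => enable_annotations.iter().any(|a| a.as_str() == short),
        // Keys outside the hypervisor namespace are rejected in this sketch.
        None => false,
    }
}
```

With `default_gpus` and `default_gpu_model` present in the list, both GPU annotations pass this check; with either one missing, the corresponding annotation would be rejected.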

stevenhorsman

Looks okay to me. Thanks @Apokleos

bpradipt

/lgtm

Apokleos added 3 commits June 26, 2025 17:27

runtime-rs: Add GPU annotations during remote hypervisor preparation

e5f44fa


Apokleos force-pushed the remote-annotation branch from e415e0a to 8c294b9 Compare June 26, 2025 09:27

runtime-rs: Enable GPU annotations in remote hypervisor configuration

e6e4cd9


Apokleos force-pushed the remote-annotation branch from 8c294b9 to e6e4cd9 Compare June 26, 2025 09:29

Apokleos mentioned this pull request Jun 26, 2025

A comparison of feature differences between Kata 3.0/runtime-rs and Kata 2.x and alignment status #8702

Open

Apokleos requested review from stevenhorsman, bpradipt, fidencio, burgerdev and lifupan June 26, 2025 12:33

stevenhorsman approved these changes Jun 26, 2025


bpradipt approved these changes Jun 26, 2025


fidencio added the ok-to-test label Jun 27, 2025

fidencio temporarily deployed to ci June 27, 2025 12:18 — with GitHub Actions Inactive

Apokleos marked this pull request as ready for review June 30, 2025 06:04

Apokleos merged commit e66baf5 into kata-containers:main Jun 30, 2025
507 of 540 checks passed
