Support environments where devices come and go frequently #15

@klueska

NVIDIA recently released a new feature in its GPUs called MIG (short for Multi-Instance GPU).

This feature allows one to partition a GPU into a set of "MIG devices", each of which appears to the software consuming them as a mini-GPU with a fixed partition of memory and a fixed partition of compute resources.

From a user's perspective, referring to one of these MIG devices is similar to referring to a full GPU (i.e. each one can be specified by a unique UUID). However, unlike full GPUs, MIG devices are created and deleted dynamically, so the set of available devices can change frequently.

To support these types of devices, it would be great if the runtime had an option to call an executable that generates a vendor's CDI spec on the fly, rather than only reading a static file on disk that holds that vendor's CDI spec.

NOTE: This doesn't necessarily require any changes to the spec itself, but rather a change in the way it is consumed.
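
To make the request concrete, here is a minimal sketch of what "consuming" a spec from either source could look like. Everything in it is an assumption for illustration: the function names `load_cdi_spec` and `device_names`, the convention that the generator prints the spec as JSON on stdout, and the example `kind` and device name are all hypothetical, not part of CDI or of any runtime today. The stand-in generator is just a Python one-liner; a real vendor executable would query the driver for the MIG devices that currently exist.

```python
import json
import subprocess
import sys
from pathlib import Path


def load_cdi_spec(source, timeout=5.0):
    """Return a CDI spec as a dict.

    `source` is either a path to a static JSON spec file (str or Path),
    or a list of strings: the command line of a generator executable that
    is assumed (hypothetically) to print the spec as JSON on stdout.
    """
    if isinstance(source, (str, Path)):
        # Static case: read the spec file from disk.
        return json.loads(Path(source).read_text())
    # Dynamic case: run the generator each time the spec is needed, so
    # devices created or deleted since the last call are reflected.
    result = subprocess.run(
        source, capture_output=True, text=True, timeout=timeout, check=True
    )
    return json.loads(result.stdout)


def device_names(spec):
    """List the device names declared in a spec."""
    return [device["name"] for device in spec.get("devices", [])]


# Stand-in for a vendor generator (hypothetical output, for illustration only).
_GENERATOR_SCRIPT = (
    "import json; "
    "print(json.dumps({'cdiVersion': '0.5.0', 'kind': 'vendor.com/gpu', "
    "'devices': [{'name': 'mig-0', 'containerEdits': {}}]}))"
)
FAKE_GENERATOR = [sys.executable, "-c", _GENERATOR_SCRIPT]

if __name__ == "__main__":
    spec = load_cdi_spec(FAKE_GENERATOR)
    print(spec["kind"], device_names(spec))
```

The point of the sketch is that the caller sees the same data structure either way, which is why this would not necessarily need a change to the spec itself, only to how the spec is obtained.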

/cc @RenaudWasTaken @mrunalp @bart0sh @kad @adrianchiris
