
Incorrect GPU resource request when preferred nodes are used #1335

@chewong

Description


Describe the bug

I have two preferred nodes of instance type Standard_NC80adis_H100_v5, each with 2 H100 GPUs (4 GPUs in total). However, when I deployed the following workspace, the two pods generated by the managed StatefulSet each request 4 GPUs:

apiVersion: kaito.sh/v1beta1
kind: Workspace
metadata:
  name: workspace-llama-3-3-70b-instruct
resource:
  count: 2
  instanceType: "Standard_NC80adis_H100_v5"
  labelSelector:
    matchLabels:
      node.kubernetes.io/instance-type: Standard_NC80adis_H100_v5
inference:
  preset:
    name: llama-3.3-70b-instruct
    presetOptions:
      modelAccessSecret: hf-token
  config: "llama-inference-params"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: "llama-inference-params"
data:
  inference_config.yaml: |
    vllm:
      cpu-offload-gb: 0
      gpu-memory-utilization: 0.95
      swap-space: 4
      max-model-len: 16384

(Screenshots: the two generated pods, each requesting 4 GPUs.)
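
To spell out the mismatch, here is a small illustrative sketch (not taken from the KAITO source or the generated StatefulSet manifest; the values simply restate the numbers above) of the per-container GPU request produced today versus the request I would expect for this node SKU:

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	gpu := corev1.ResourceName("nvidia.com/gpu")

	// Illustrative values only (taken from the numbers in the description,
	// not copied from the generated StatefulSet itself).
	observed := corev1.ResourceList{gpu: resource.MustParse("4")} // what each pod requests today
	expected := corev1.ResourceList{gpu: resource.MustParse("2")} // 2 H100s per Standard_NC80adis_H100_v5 node

	o, e := observed[gpu], expected[gpu]
	fmt.Printf("observed per-pod request: %s, expected: %s\n", o.String(), e.String())
}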

Expected behavior

Each pod should request 2 GPUs, based on the number of GPUs available on the preferred nodes.
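
A minimal sketch of the calculation I would expect, assuming the controller can read the preferred nodes' allocatable nvidia.com/gpu (illustrative only, not the actual KAITO implementation; expectedGPUsPerPod is a hypothetical helper):

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// expectedGPUsPerPod is an illustrative helper (not the KAITO code): sum the
// allocatable "nvidia.com/gpu" across the preferred nodes and divide by the
// workspace's resource.count (the number of inference pods).
func expectedGPUsPerPod(nodes []corev1.Node, podCount int64) int64 {
	if podCount == 0 {
		return 0
	}
	var total int64
	for _, n := range nodes {
		if q, ok := n.Status.Allocatable[corev1.ResourceName("nvidia.com/gpu")]; ok {
			total += q.Value()
		}
	}
	return total / podCount
}

func main() {
	// Two Standard_NC80adis_H100_v5 nodes, each with 2 allocatable H100s.
	node := corev1.Node{Status: corev1.NodeStatus{
		Allocatable: corev1.ResourceList{
			corev1.ResourceName("nvidia.com/gpu"): resource.MustParse("2"),
		},
	}}
	fmt.Println(expectedGPUsPerPod([]corev1.Node{node, node}, 2)) // prints 2, not 4
}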

Logs

I was debugging this and noticed that the field selector here is not working:

(Screenshot: the field selector in the node lookup code.)

causing this function to fall back to the default number of GPUs required for the model.
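
For context, the lookup is along these lines (a simplified client-go sketch, not the actual KAITO code; listNodeByName and the node name are placeholders): when the field selector matches nothing, the list comes back empty and the GPU count falls back to the preset default.

package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/fields"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// listNodeByName sketches the kind of lookup involved: list a node by name via
// a metadata.name field selector. If the selector matches nothing, the returned
// list is empty and the caller ends up using the preset's default GPU count
// instead of the node's actual capacity.
func listNodeByName(ctx context.Context, cs kubernetes.Interface, name string) ([]string, error) {
	sel := fields.OneTermEqualSelector("metadata.name", name).String()
	nodes, err := cs.CoreV1().Nodes().List(ctx, metav1.ListOptions{FieldSelector: sel})
	if err != nil {
		return nil, err
	}
	found := make([]string, 0, len(nodes.Items))
	for _, n := range nodes.Items {
		found = append(found, n.Name)
	}
	return found, nil
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	cs := kubernetes.NewForConfigOrDie(cfg)
	// "aks-h100-node-0" is a placeholder node name, not one from my cluster.
	names, err := listNodeByName(context.Background(), cs, "aks-h100-node-0")
	fmt.Println(names, err)
}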

Environment

  • Kubernetes version (use kubectl version):
  • OS (e.g: cat /etc/os-release):
  • Install tools:
  • Others:

Additional context

Metadata

Labels: bug (Something isn't working)
Status: Done
Assignees: none
Milestone: none