Closed
Labels
bug: Something isn't working
Description
Steps to reproduce
I created a dstack SSH fleet with Hot Aisle's single MI300X GPU using dstack Sky. Then, when I applied the service configuration, a server validation error occurred.
Steps to reproduce:
- dstack apply -f hotaisle.fleet.yml
- dstack apply -f .dstack.yml
Configurations:
1. .dstack.yml
type: service
name: deepseek-r1-amd
image: rocm/vllm:rocm6.3.1_instinct_vllm0.8.3_20250410
env:
  - MODEL_ID=deepseek-ai/DeepSeek-R1-Distill-Llama-70B
  - MAX_MODEL_LEN=126432
commands:
  - vllm serve $MODEL_ID
    --max-model-len $MAX_MODEL_LEN
    --trust-remote-code
port: 8000
model: deepseek-ai/DeepSeek-R1-Distill-Llama-70B
volumes:
  - /root/.cache/huggingface/hub:/root/.cache/huggingface/hub
resources:
  gpu: mi300x
  disk: 300Gb..
2. hotaisle.fleet.yml
type: fleet
# The name is optional, if not specified, generated randomly
name: hotaisle-fleet
# Uncomment if instances are interconnected
#placement: cluster
# SSH credentials for the on-prem servers
ssh_config:
  user: hotaisle
  identity_file: ~/.ssh/id_rsa
  hosts:
    - 23.183.40.75
Actual behaviour
[shim.log](https://github.com/user-attachments/files/19850803/shim.log)
$ dstack apply -f .dstack.yml
Project dstack-team-pool
User Bihan
Configuration .dstack.yml
Type service
Resources 2..xCPU, 8GB.., 1xmi300x, 300GB.. (disk)
Max price -
Max duration -
Spot policy on-demand
Retry policy -
Creation policy reuse-or-create
Idle duration 5m
Reservation -
 #  BACKEND  REGION  INSTANCE  RESOURCES                                          SPOT  PRICE
 1  ssh      remote  instance  8xCPU, 220GB, 1xMI300X (192GB), 11149.2GB (disk)   no    $0     idle
Finished run deepseek-r1-amd already exists.
Override the run? [y/n]: y
Server validation error:
{'detail': [{'loc': ['body',
                     'plan',
                     'current_resource',
                     'run_spec',
                     'configuration',
                     'ServiceConfigurationRequest',
                     'tags'],
             'msg': 'extra fields not permitted',
             'type': 'value_error.extra'},
            {'loc': ['body',
                     'plan',
                     'current_resource',
                     'run_spec',
                     'profile',
                     'tags'],
             'msg': 'extra fields not permitted',
             'type': 'value_error.extra'},
            {'loc': ['body',
                     'plan',
                     'current_resource',
                     'run_spec',
                     '__root__'],
             'msg': 'Missing configuration',
             'type': 'value_error'}]}
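The nested `loc` lists are hard to scan, so here is a minimal sketch that flattens each entry of `detail` into one line. It uses only the standard library and the response copied from the output above. The helper name `summarize_validation_errors` is hypothetical and is not part of dstack.

```python
from typing import Any


def summarize_validation_errors(response: dict[str, Any]) -> list[str]:
    """Render each error as 'dotted.path: message (type)'."""
    lines = []
    for err in response.get("detail", []):
        # Join the location parts into a single dotted path.
        path = ".".join(str(part) for part in err.get("loc", []))
        lines.append(f"{path}: {err.get('msg')} ({err.get('type')})")
    return lines


# The response body reported in this issue, copied verbatim.
run_spec = ["body", "plan", "current_resource", "run_spec"]
response = {
    "detail": [
        {
            "loc": run_spec + ["configuration", "ServiceConfigurationRequest", "tags"],
            "msg": "extra fields not permitted",
            "type": "value_error.extra",
        },
        {
            "loc": run_spec + ["profile", "tags"],
            "msg": "extra fields not permitted",
            "type": "value_error.extra",
        },
        {
            "loc": run_spec + ["__root__"],
            "msg": "Missing configuration",
            "type": "value_error",
        },
    ]
}

summary = summarize_validation_errors(response)
for line in summary:
    print(line)
```

Read this way, two of the three errors point at a `tags` field (one under `configuration`, one under `profile`), and the third reports `Missing configuration` at the `run_spec` root.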
Expected behaviour
Submit the run deepseek-r1-amd? [y/n]: y
NAME             BACKEND       RESOURCES                                          PRICE  STATUS   SUBMITTED
deepseek-r1-amd  ssh (remote)  8xCPU, 220GB, 1xMI300X (192GB), 11451.2GB (disk)   $0.0   running  09:35
deepseek-r1-amd provisioning completed (running)
Service is published at:
https://deepseek-r1-amd.dstack-team-pool.sky.dstack.ai/
Model deepseek-ai/DeepSeek-R1-Distill-Llama-70B is published at:
https://gateway.dstack-team-pool.sky.dstack.ai/
INFO 04-22 03:54:14 [__init__.py:239] Automatically detected platform rocm.
INFO 04-22 03:54:15 [api_server.py:1034] vLLM API server version 0.8.3.dev349+gb8498bc4a
INFO 04-22 03:54:15 [api_server.py:1035] args: Namespace(subparser='serve', model_tag='deepseek-ai/DeepSeek-R1-Distill-Llama-70B', config='',
...
...
...
INFO 04-22 03:54:38 [model_runner.py:1110] Starting to load model deepseek-ai/DeepSeek-R1-Distill-Llama-70B...
INFO 04-22 03:54:38 [weight_utils.py:265] Using model weights format ['*.safetensors']
model-00003-of-000017.safetensors: 100% 1.58G/1.58G [00:10<00:00, 149MB/s]
model-00005-of-000017.safetensors: 100% 8.42G/8.42G [00:55<00:00, 152MB/s]
model-00007-of-000017.safetensors: 100% 8.42G/8.42G [00:56<00:00, 150MB/s]
model-00002-of-000017.safetensors: 100% 8.69G/8.69G [00:57<00:00, 151MB/s]
model-00004-of-000017.safetensors: 100% 8.69G/8.69G [00:57<00:00, 150MB/s]
model-00006-of-000017.safetensors: 100% 8.69G/8.69G [00:57<00:00, 150MB/s]
model-00008-of-000017.safetensors: 100% 8.69G/8.69G [00:57<00:00, 150MB/s]
model-00001-of-000017.safetensors: 100% 8.95G/8.95G [00:58<00:00, 152MB/s]
model-00009-of-000017.safetensors: 100% 8.42G/8.42G [00:56<00:00, 150MB/s]
model-00011-of-000017.safetensors: 100% 8.42G/8.42G [00:53<00:00, 156MB/s]
model-00010-of-000017.safetensors: 100% 8.69G/8.69G [00:55<00:00, 157MB/s]
model-00013-of-000017.safetensors: 100% 8.42G/8.42G [00:53<00:00, 158MB/s]
model-00015-of-000017.safetensors: 100% 8.42G/8.42G [00:53<00:00, 157MB/s]
model-00012-of-000017.safetensors: 100% 8.69G/8.69G [00:54<00:00, 160MB/s]
model-00014-of-000017.safetensors: 100% 8.69G/8.69G [00:54<00:00, 158MB/s]
model-00016-of-000017.safetensors: 100% 8.69G/8.69G [00:54<00:00, 160MB/s]
model-00017-of-000017.safetensors: 100% 10.5G/10.5G [00:58<00:00, 181MB/s]
...
...
dstack version
Dstack Repo Version: 2102b1b
Server logs
Additional information
The same steps work without errors using dstack CLI 0.19.4.
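One possible reading, stated as an assumption rather than a confirmed diagnosis: the CLI built from 2102b1b sends a `tags` field that the server's request schema does not declare, and the server rejects unknown fields. The sketch below imitates that strictness with plain Python. The function `find_extra_fields` and the allowed field set are illustrative only and do not reflect dstack's actual schema. Only the `tags` key and the error text come from the report above.

```python
def find_extra_fields(payload: dict, allowed: set[str]) -> list[dict]:
    """Report every key in payload that a strict schema does not declare."""
    return [
        {
            "loc": [key],
            "msg": "extra fields not permitted",
            "type": "value_error.extra",
        }
        for key in payload
        if key not in allowed
    ]


# Hypothetical: the profile fields an older server might accept.
OLDER_SERVER_PROFILE_FIELDS = {"name", "spot_policy", "idle_duration"}

# Hypothetical: a profile serialized by a newer client that also emits `tags`.
newer_client_profile = {
    "name": "default",
    "spot_policy": "on-demand",
    "idle_duration": "5m",
    "tags": None,
}

errors = find_extra_fields(newer_client_profile, OLDER_SERVER_PROFILE_FIELDS)
print(errors)
```

Under this assumption, a client that omits `tags` (as 0.19.4 appears to) would pass the same check, which is consistent with the observation that 0.19.4 works.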