Description
Checklist
- 1. I have searched related issues but cannot get the expected help.
- 2. The bug has not been fixed in the latest version.
- 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
- 4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed.
- 5. Please use English, otherwise it will be closed.
Describe the bug
SGLang v0.4.x AMD MI300X Workload Debug
Bug report:
- The HIP vllm version pinned in python/pyproject.toml needs to be updated.
- rocm_vllm v0.6.5 depends on outlines==0.1.11.
- The Triton compiler raises an error in the decode attention kernel.
A working AMD MI300X workload is shared below.
Reproduction
docker image: rocm/vllm-dev:20241218
pip uninstall vllm
git clone https://github.com/ROCm/vllm.git rocm_vllm && cd rocm_vllm
python setup.py develop && cd ..
git clone https://github.com/sgl-project/sglang.git && cd sglang
vim python/pyproject.toml +21  # line 21: "orjson", "outlines>=0.1.7", "outlines-core>=0.1.17"
vim python/pyproject.toml +30  # line 30: vllm==0.6.5.dev411+gd08b78b5.rocm634
vim python/sglang/srt/constrained/outlines_backend.py +23  # line 23: from outlines_core.fsm.json_schema import build_regex_from_schema
vim python/sglang/srt/layers/attention/triton_ops/decode_attention.py +405  # line 405: BLOCK=32 -> BLOCK=16
pip install -e "python[all_hip]"
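The manual vim edits above could also be applied non-interactively. The following is only a minimal sketch: `replace_on_line` is a hypothetical helper, not part of sglang, and it assumes the line numbers quoted in this issue still match the checked-out commit. It refuses to write if the target line does not contain the expected text, so a drifted line number fails loudly.

```python
from pathlib import Path


def replace_on_line(path, lineno, old, new):
    """Replace the first `old` with `new` on 1-indexed line `lineno` of `path`.

    Hypothetical helper for illustration. Raises ValueError if the line is
    missing or does not contain `old`, instead of editing the wrong place.
    """
    file = Path(path)
    lines = file.read_text().splitlines(keepends=True)
    if not 1 <= lineno <= len(lines):
        raise ValueError(f"{path} has no line {lineno}")
    if old not in lines[lineno - 1]:
        raise ValueError(f"{path}:{lineno} does not contain {old!r}")
    # Only the first occurrence on that line is changed.
    lines[lineno - 1] = lines[lineno - 1].replace(old, new, 1)
    file.write_text("".join(lines))
```

For example, `replace_on_line("python/sglang/srt/layers/attention/triton_ops/decode_attention.py", 405, "32", "16")` would correspond to the BLOCK change, assuming line 405 holds the literal `32`. Check the exact text of each line before running it, and apply the pyproject.toml edits before the `pip install` step.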
one-batch:
python -m sglang.bench_one_batch --batch-size 32 --input 128 --output 32 --model /data/deepseekv2-lite/ --dp 1 --tp 1 --trust-remote-code
server-client:
python3 -m sglang.launch_server --model-path /data/deepseekv2-lite/ --disable-radix-cache --trust-remote-code --tp 2 --enable-dp-attention --mem-fraction-static 0.78
python3 -m sglang.bench_serving --backend sglang --dataset-name random --random-input 1 --random-output 32 --random-range-ratio 1 --num-prompts 1000
Environment
rocm/vllm-dev:20241218
sglang/main commit d95a5f5
ROCm/vllm/main commit d08b78b50c94239beca3701d286c6d6202b44bd9
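To confirm that the pins described in the reproduction steps actually took effect, the installed package versions can be listed from inside the container. This is a rough sketch using only the standard library. The distribution names in the list are assumptions (for instance, Triton may be installed under a different name in ROCm images), so adjust them as needed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {distribution name: version string or "not installed"}."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = "not installed"
    return result


if __name__ == "__main__":
    # Assumed distribution names; edit to match the environment.
    names = ["sglang", "vllm", "outlines", "outlines-core", "triton"]
    for name, ver in installed_versions(names).items():
        print(f"{name}: {ver}")
```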