Description
Checklist
- 1. I have searched related issues but cannot get the expected help.
- 2. The bug has not been fixed in the latest version.
- 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
- 4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose ; otherwise, it will be closed.
- 5. Please use English, otherwise it will be closed.
Describe the bug
The following error occurs when starting the server (full command under Reproduction):
File "/sgl-workspace/sglang/python/sglang/srt/layers/moe/fused_moe_triton/fused_moe.py", line 1250, in fused_experts
torch.ops.sglang.inplace_fused_experts(
File "/usr/local/lib/python3.12/dist-packages/torch/_ops.py", line 1122, in __call__
return self._op(*args, **(kwargs or {}))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/sgl-workspace/sglang/python/sglang/srt/layers/moe/fused_moe_triton/fused_moe.py", line 1095, in inplace_fused_experts
fused_experts_impl(
File "/sgl-workspace/sglang/python/sglang/srt/layers/moe/fused_moe_triton/fused_moe.py", line 1424, in fused_experts_impl
invoke_fused_moe_kernel(
File "/sgl-workspace/sglang/python/sglang/srt/layers/moe/fused_moe_triton/fused_moe.py", line 789, in invoke_fused_moe_kernel
A, A_scale = per_token_group_quant_fp8(A, block_k)
^^^^^^^^^^^^^^^^^^^^^^^^^
NameError: name 'per_token_group_quant_fp8' is not defined. Did you mean: 'per_token_group_quant_int8'?
[2025-04-07 19:18:42] Received sigquit from a child process. It usually means the child failed.
--- Logging error ---
[2025-04-07 19:18:42] Received sigquit from a child process. It usually means the child failed.
[2025-04-07 19:18:42] Received sigquit from a child process. It usually means the child failed.
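A possible explanation, not verified against the sglang source: the traceback is consistent with a helper that is only imported or defined under a platform check, so the name is missing on the ROCm image and the failure surfaces only when the FP8 code path is first called. The sketch below reproduces that pattern in isolation. All names in it (IS_CUDA, quant_fp8, quant_int8, invoke_kernel, try_invoke) are hypothetical stand-ins, not sglang's actual code.

```python
# Minimal sketch of a platform-guarded definition that fails at call time.
# Every name here is a hypothetical stand-in, not sglang's real code.

IS_CUDA = False  # assumption: pretend this is a non-CUDA (e.g. ROCm) build

if IS_CUDA:
    # Only defined on CUDA builds, mirroring a guarded import.
    def quant_fp8(x, group_size):
        return x, 1.0


def quant_int8(x, group_size):
    # Defined unconditionally, so it exists on every platform.
    return x, 1.0


def invoke_kernel(x, use_fp8):
    if use_fp8:
        # The global name is looked up here, when the function runs,
        # not when the module is loaded.
        return quant_fp8(x, 128)
    return quant_int8(x, 128)


def try_invoke(x, use_fp8):
    """Return "ok" on success, or the NameError message on failure."""
    try:
        invoke_kernel(x, use_fp8)
        return "ok"
    except NameError as exc:
        return str(exc)


if __name__ == "__main__":
    print(try_invoke([1.0], use_fp8=False))
    print(try_invoke([1.0], use_fp8=True))
```

If this is the cause, the module loads without complaint and only the FP8 branch fails, which would match the error appearing inside invoke_fused_moe_kernel rather than at import.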
Reproduction
python3 -m sglang.launch_server --model /deepseek/DeepSeek-R1 --tp 8 --trust-remote-code --chunked-prefill-size 131072 --enable-torch-compile --torch-compile-max-bs 256
Environment
lmsysorg/sglang:v0.4.5-rocm630