Update Triton configs for block fp8 kernels #2641

HandH1998 · 2024-12-29T14:02:11Z

Update Triton configs for block fp8 kernels

BBuf · 2025-01-02T11:46:35Z

benchmark/kernels/fused_moe_triton/tuning_fused_moe_triton.py

@@ -418,8 +418,7 @@ def _distribute(method: str, inputs: List[Any]) -> List[Any]:
            search_space = [
                config
                for config in search_space
-                if block_n % config["BLOCK_SIZE_N"] == 0


It seems that this change would reduce the search space in normal cases, which might have a slight impact on fused_moe_triton performance.

zhyncs · 2025-01-02

cc @HandH1998

HandH1998 · 2025-01-03

@BBuf Do you mean the removed line `if block_n % config["BLOCK_SIZE_N"] == 0`? I think removing that line makes the search space larger, not smaller, because it drops a constraint that every config previously had to satisfy.
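To make the point about the search space concrete, here is a minimal sketch of the filter with and without the `block_n` check. The candidate tile sizes and the helper names (`build_search_space`, `filter_configs`) are assumptions made up for illustration; they are not taken from `tuning_fused_moe_triton.py`.

```python
from itertools import product


def build_search_space():
    # Hypothetical candidate tile sizes, chosen only for illustration.
    return [
        {"BLOCK_SIZE_N": n, "BLOCK_SIZE_K": k}
        for n, k in product([32, 64, 128, 256], [64, 128, 256])
    ]


def filter_configs(search_space, block_n, block_k, require_n_divisibility):
    """Keep configs that satisfy the block-quantization constraints."""
    kept = []
    for config in search_space:
        # Constraint that stays in place: the K tile must divide the
        # quantization block size along K.
        if block_k % config["BLOCK_SIZE_K"] != 0:
            continue
        # Constraint discussed in this thread: the analogous check along N.
        if require_n_divisibility and block_n % config["BLOCK_SIZE_N"] != 0:
            continue
        kept.append(config)
    return kept


space = build_search_space()
with_n_check = filter_configs(space, 128, 128, require_n_divisibility=True)
without_n_check = filter_configs(space, 128, 128, require_n_divisibility=False)
print(len(space), len(with_n_check), len(without_n_check))
```

With these assumed candidates and a 128 x 128 quantization block, dropping the `block_n` check can only add configs (here, those with `BLOCK_SIZE_N` of 256); it never removes any.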

`if block_k % config["BLOCK_SIZE_K"] == 0` is still required by the block w8a8 fp8 GEMM. In the GEMM main loop, this constraint ensures that each iteration needs only one quantization scale.
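The divisibility argument can be checked with a small model of the K loop. This is a sketch under stated assumptions, not the kernel's actual indexing: it assumes one scale per `block_k` elements along K and that the main loop advances in steps of `BLOCK_SIZE_K` starting at 0. The function names are hypothetical.

```python
def scale_blocks_touched(k_start, tile_k, block_k):
    """Indices of the quantization blocks overlapped by one K tile."""
    first = k_start // block_k
    last = (k_start + tile_k - 1) // block_k
    return set(range(first, last + 1))


def max_scales_per_tile(K, tile_k, block_k):
    """Largest number of distinct scales any single K tile would need."""
    return max(
        len(scale_blocks_touched(k0, tile_k, block_k))
        for k0 in range(0, K, tile_k)
    )


# Assumed sizes for illustration: K = 512, quantization block of 128 along K.
divides_64 = max_scales_per_tile(512, 64, 128)    # 128 % 64 == 0
divides_128 = max_scales_per_tile(512, 128, 128)  # 128 % 128 == 0
too_large = max_scales_per_tile(512, 256, 128)    # 128 % 256 != 0
print(divides_64, divides_128, too_large)
```

When `BLOCK_SIZE_K` divides `block_k`, every tile falls inside a single quantization block, so one scale per loop iteration is enough. A tile of 256 straddles two blocks and would need two scales.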

update triton configs for block fp8 kernels

d4dbf4a

HandH1998 requested review from zhyncs, ispobock, HaiShaw, merrymercy and Ying1123 as code owners December 29, 2024 14:02

HandH1998 changed the title ~~Update Trion configs for block fp8 kernels~~ Update Triton configs for block fp8 kernels Dec 29, 2024

zhyncs merged commit afa0341 into main Dec 29, 2024
17 checks passed

zhyncs deleted the tune_kernel branch December 29, 2024 14:53

robertgshaw2-redhat mentioned this pull request Dec 29, 2024

[Kernel] Triton Configs for Fp8 Block Quantization vllm-project/vllm#11589

Merged

BBuf reviewed Jan 2, 2025


timethink pushed a commit to timethink/sglang that referenced this pull request Mar 9, 2025

Update Triton configs for block fp8 kernels (sgl-project#2641)

8ae23aa
