[ROCm] fix dtype #4510

yiakwy-xpu-ml-framework-team · 2025-03-17T11:24:17Z

Motivation

Fix the fp8 dtype problem introduced in #4485.

Modifications

Checklist

Format your code according to the Code Formatting with Pre-Commit.
Add unit tests as outlined in the Running Unit Tests.
Update documentation / docstrings / example tutorials as needed, according to Writing Documentation.
Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to Benchmark and Profiling and Accuracy Results.
For reviewers: If you haven't made any contributions to this PR and are only assisting with merging the main branch, please remove yourself as a co-author when merging the PR.
Please feel free to join our Slack channel at https://slack.sglang.ai to discuss your PR.

yiakwy-xpu-ml-framework-team · 2025-03-17T11:25:33Z

cc @HaiShaw

hebiao064 · 2025-03-17T14:32:52Z

Thanks a lot!

hebiao064 · 2025-03-17T21:20:34Z

python/sglang/srt/layers/quantization/w8a8_fp8.py

@@ -108,10 +108,15 @@ def process_weights_after_loading(self, layer: torch.nn.Module) -> None:
                    layer.weight, layer.weight.shape[-1]
                )
                weight_scale = weight_scale.t().contiguous()
+                if _is_hip:


Does AMD also support CUTLASS? I thought it didn't...

yiakwy-xpu-ml-framework-team · 2025-03-24

CK_tile (which I personally consider AMD's counterpart to CUTLASS) is currently the recommended way to program complex kernel fusion.

It provides tile window to DDR window transformations, a pipeliner, and a partitioner. It is very useful; you can check it out.

Also, the recent ASM kernels add many assembly-level optimizations.

fix dtype

3046ce1

yiakwy-xpu-ml-framework-team requested review from merrymercy, Ying1123, zhyncs, ispobock and HaiShaw as code owners March 17, 2025 11:24

merrymercy merged commit 5f9b2c6 into sgl-project:main Mar 17, 2025
19 of 21 checks passed

hebiao064 reviewed Mar 17, 2025
