whchung
Contributor

@whchung whchung commented Feb 16, 2025

Modifications

Add additional block quant GEMM tuning configs for AMD GPUs.

Checklist

Collaborator

@HaiShaw HaiShaw left a comment

LG

@yiakwy-xpu-ml-framework-team
Contributor

Hi @whchung, do we have a profiling comparison? I am really interested in how "BLOCK_SIZE_N" was chosen between 16 and 64.

Last year we had a paper that studied this parameter choice thoroughly. The study shows that the chosen values are typically 16, 64, or 128, which is closely related to memory transaction bandwidth.
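
To make the link between "BLOCK_SIZE_N" and memory transactions concrete, here is a minimal sketch. It rests on assumptions that are not from this PR: 1-byte (FP8-style) elements and a placeholder 64-byte transaction size. The helper names are hypothetical, and the numbers are illustrative arithmetic, not profiling results.

```python
import math

# Candidate values mentioned in the comment above.
CANDIDATE_BLOCK_SIZE_N = (16, 64, 128)

ELEMENT_BYTES = 1       # assumption: 1-byte elements (e.g. an FP8 block-quant GEMM)
TRANSACTION_BYTES = 64  # assumption: placeholder transaction size, not a measured value


def row_bytes(block_size_n, element_bytes=ELEMENT_BYTES):
    """Bytes covered along N by one row of a tile."""
    return block_size_n * element_bytes


def transactions_per_row(block_size_n, transaction_bytes=TRANSACTION_BYTES):
    """Number of whole transactions needed to fetch one tile row."""
    return math.ceil(row_bytes(block_size_n) / transaction_bytes)


def utilization(block_size_n, transaction_bytes=TRANSACTION_BYTES):
    """Fraction of the fetched bytes that the tile row actually uses."""
    fetched = transactions_per_row(block_size_n, transaction_bytes) * transaction_bytes
    return row_bytes(block_size_n) / fetched


if __name__ == "__main__":
    for n in CANDIDATE_BLOCK_SIZE_N:
        print(n, row_bytes(n), transactions_per_row(n), utilization(n))
```

Under these assumptions a row of 16 elements fills only a quarter of one transaction, while 64 and 128 fill whole transactions. Real behavior depends on the data type, layout, and hardware, so a profiling comparison is still needed.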

@HaiShaw HaiShaw enabled auto-merge (squash) February 17, 2025 06:53
@HaiShaw HaiShaw self-requested a review February 17, 2025 06:55
@HaiShaw HaiShaw disabled auto-merge February 17, 2025 06:56
@HaiShaw HaiShaw enabled auto-merge (squash) February 17, 2025 06:57
@saienduri saienduri disabled auto-merge February 17, 2025 23:54
@saienduri saienduri merged commit 2eab113 into sgl-project:main Feb 17, 2025
3 of 18 checks passed
4 participants