BobQC

Popular repositories

SageAttention (Public)

Forked from thu-ml/SageAttention

Quantized attention that achieves speedups of 2-5x and 3-11x compared to FlashAttention and xformers, respectively, without losing end-to-end metrics across language, image, and video models.

Cuda