Comparing experimental triton and torch implementations of DeepSeek-inspired MLA. This is a work in progress, but you can already run some experiments. The triton kernels for DeepSeek MLA already run, but I wouldn't bet that they are completely correct at the moment; they still need to be verified.
Implementing DeepSeek-style multi-head latent attention (MLA) in torch and triton and comparing performance.
juvi21/whale-MLA-triton
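For readers new to MLA, here is a minimal torch sketch of the core idea and of the kind of cross-check the description says is still needed. It is not this repository's code. It assumes a simplified MLA with no decoupled RoPE keys and no query compression, and every name and size in it (`W_dkv`, `W_uk`, `W_uv`, `W_q`, `mla_naive`, `mla_absorbed`, the toy dimensions) is hypothetical. It caches only a low-rank latent `c_kv`, then compares a naive path that expands keys and values against a path that folds the key up-projection into the query.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy sizes, chosen only for illustration.
d_model, d_latent, d_head, n_heads, seq_len = 64, 16, 8, 4, 10
dtype = torch.float64  # float64 keeps the comparison tolerance tight

h = torch.randn(seq_len, d_model, dtype=dtype)  # token hidden states
W_dkv = torch.randn(d_model, d_latent, dtype=dtype) / d_model ** 0.5  # KV down-projection
W_uk = torch.randn(n_heads, d_latent, d_head, dtype=dtype) / d_latent ** 0.5  # key up-projection
W_uv = torch.randn(n_heads, d_latent, d_head, dtype=dtype) / d_latent ** 0.5  # value up-projection
W_q = torch.randn(n_heads, d_model, d_head, dtype=dtype) / d_model ** 0.5  # query projection

# Only this (seq_len, d_latent) latent would be stored in the KV cache.
c_kv = h @ W_dkv


def causal_softmax(scores):
    # Mask out future positions, then normalise over the key axis.
    mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
    return torch.softmax(scores.masked_fill(mask, float("-inf")), dim=-1)


def mla_naive():
    # Expand the latent into per-head keys and values, then do standard attention.
    q = torch.einsum("sd,hde->hse", h, W_q)
    k = torch.einsum("sl,hle->hse", c_kv, W_uk)
    v = torch.einsum("sl,hle->hse", c_kv, W_uv)
    attn = causal_softmax(q @ k.transpose(-1, -2) / d_head ** 0.5)
    return attn @ v  # (n_heads, seq_len, d_head)


def mla_absorbed():
    # Fold W_uk into the query so scores are computed against the latent directly.
    q = torch.einsum("sd,hde->hse", h, W_q)
    q_lat = torch.einsum("hse,hle->hsl", q, W_uk)
    attn = causal_softmax(q_lat @ c_kv.T / d_head ** 0.5)
    ctx = attn @ c_kv  # weighted sum of latents, (n_heads, seq_len, d_latent)
    return torch.einsum("hsl,hle->hse", ctx, W_uv)  # apply W_uv after attention


out_naive = mla_naive()
out_absorbed = mla_absorbed()
max_abs_diff = (out_naive - out_absorbed).abs().max().item()
print(f"max abs difference: {max_abs_diff:.3e}")
```

The two paths are algebraically identical, so any difference beyond floating-point noise points to a bug. The same pattern (a slow, obviously correct reference compared against an optimised path) is one way a triton kernel could be checked against a torch baseline before timing either of them.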