Closed
Description
- Initial Llama 4 Support @CatherineSue @fzyzcjy @ispobock @ch-wan: Add Llama4 support (#5092)
- Llama 4 User Guide @ch-wan @ispobock: Add Llama4 user guide (#5133)
- Vision Backbone Support @mickqian: model: support mllama4 (#5144)
- Local Attention Support in Various Attention Backbones
  - FlashAttention V3
  - FlashInfer
  - Triton
- Quantization
- Kernel Optimization
- Memory Optimization @tarinkk @Pb314314: Hybrid kv cache for LLaMA4 (#6563)
- EP Optimization
- Llama4 Tool Call Support @CatherineSue: feat: support pythonic tool call and index in tool call streaming (#5725)