[Feature] integrate flash-attention #4385

@zhyncs

Description

Motivation

Similar to #4384.

Since SGLang now supports page sizes greater than 1 (#4356), we should integrate FlashAttention (https://github.com/Dao-AILab/flash-attention).
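For reference, recent flash-attention releases expose a paged-KV decode kernel, `flash_attn_with_kvcache` with a `block_table` argument, which is the kind of interface such an integration would likely build on. Below is a minimal sketch of calling it; the shapes, page size, and sequence lengths are illustrative assumptions, not SGLang code.

```python
# Minimal sketch of FlashAttention's paged-KV decode interface
# (flash_attn_with_kvcache with block_table, flash-attn >= 2.5).
# Shapes and sizes here are illustrative assumptions, not SGLang code.
import torch
from flash_attn import flash_attn_with_kvcache

batch, heads, head_dim = 2, 8, 128
page_size = 256  # early paged-KV releases required a multiple of 256
pages_per_seq, num_pages = 2, 8

# One new query token per sequence (a decode step).
q = torch.randn(batch, 1, heads, head_dim, dtype=torch.float16, device="cuda")

# Paged KV cache: (num_pages, page_size, heads, head_dim).
k_cache = torch.randn(num_pages, page_size, heads, head_dim,
                      dtype=torch.float16, device="cuda")
v_cache = torch.randn_like(k_cache)

# block_table maps each sequence to the physical pages backing its KV cache;
# cache_seqlens gives the number of valid tokens already cached per sequence.
block_table = torch.arange(batch * pages_per_seq, dtype=torch.int32,
                           device="cuda").reshape(batch, pages_per_seq)
cache_seqlens = torch.full((batch,), page_size + 17,
                           dtype=torch.int32, device="cuda")

out = flash_attn_with_kvcache(
    q, k_cache, v_cache,
    cache_seqlens=cache_seqlens,
    block_table=block_table,
    causal=True,
)
print(out.shape)  # (batch, 1, heads, head_dim)
```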

Related resources

No response
