[Feature] Adapt mllama4 to support Vision attention. #8487

@JustinTong0323

Description

Motivation

Currently, the vision model in mllama4 is imported from transformers, which may perform poorly. We could instead implement its modules using sglang's own vision attention implementation.

Related resources

No response
