
NF4 quantized Flux models with LoRAs #10496

@hamzaakyildiz

Description


Is there any update here? With NF4 quantized Flux models, I could not use any LoRA.

Update: NF4 serialization and loading are working fine. @DN6, let's brainstorm how we can support it more easily. This would help us unlock LoRAs on the quantized weights, too (cc: @BenjaminBossan for PEFT). I think this will become increasingly critical for larger models.
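
For concreteness, here is a minimal sketch of the workflow in question, assuming diffusers' bitsandbytes integration (NF4 via `BitsAndBytesConfig`); the LoRA repo id below is a placeholder, not a real checkpoint:

```python
import torch
from diffusers import BitsAndBytesConfig, FluxPipeline, FluxTransformer2DModel

# Quantize the Flux transformer to NF4 at load time.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
)

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()

# Loading a LoRA on top of the NF4-quantized transformer is the step
# this issue asks to support.
pipe.load_lora_weights("some-user/flux-lora", adapter_name="example")  # placeholder repo id
```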

transformers has a nice reference for us to follow. accelerate also has https://huggingface.co/docs/accelerate/en/usage_guides/quantization, but it doesn't support NF4 serialization yet.
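
The transformers-side pattern referenced above looks roughly like the sketch below; the checkpoint id is illustrative, and serializing 4-bit weights there requires a recent bitsandbytes:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # illustrative checkpoint
    quantization_config=quant_config,
)

# transformers can serialize the quantized weights back out; this is the
# behavior the issue wants mirrored in diffusers.
model.save_pretrained("llama-2-7b-nf4")
```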

Cc: @SunMarc for jamming on this together.

Originally posted by @sayakpaul in #9165 (comment)
