Is there any update here? With NF4-quantized Flux models, I could not use any LoRA.
Update: NF4 serialization and loading are working fine. @DN6, let's brainstorm how we can support it more easily. This would help us unlock doing LoRAs on the quantized weights, too (cc: @BenjaminBossan for PEFT). I think this will become increasingly critical for larger models. `transformers` has a nice reference for us to follow. Additionally, `accelerate` has https://huggingface.co/docs/accelerate/en/usage_guides/quantization, but it doesn't support NF4 serialization yet. Cc: @SunMarc for jamming on this together.
Originally posted by @sayakpaul in #9165 (comment)
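For context, here is a minimal sketch of the workflow being tracked here: quantizing a Flux transformer to NF4, serializing it, and loading a LoRA on top. It assumes a recent `diffusers` build with `BitsAndBytesConfig` support (the quantization integration discussed above); the LoRA path is hypothetical.

```python
import torch
from diffusers import BitsAndBytesConfig, FluxPipeline, FluxTransformer2DModel

# NF4 quantization config via bitsandbytes, as discussed in the comment above.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the Flux transformer with NF4-quantized weights.
transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
)

# NF4 serialization, the part the comment reports as working.
transformer.save_pretrained("flux-transformer-nf4")

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)

# Attaching a LoRA to the quantized transformer is exactly the step this
# issue tracks; the checkpoint path below is a placeholder.
pipe.load_lora_weights("path/to/lora")
```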