Closed
Description
Thank you very much for your excellent open-source work!
In my tests, I used SigLIP to encode the low-level features of a reference image. This did improve editing capability, but it also altered some details of the original image, such as facial identity (ID).
I've noticed that a number of existing research works combine SigLIP and a VAE to encode image features. This has led me to a couple of questions for the author(s):
- In your view, is this combined (SigLIP + VAE) approach a feasible direction for your project?
- Alternatively, if your team has already explored this approach, what issues or challenges led you not to adopt it?