Description
Hello, thanks for the work! We see many classic SR methods in the paper. The comparison to Real-ESRGAN+ looks promising!
However, the paper seems to claim that “our method using both synthetic and real world benchmarks demonstrates its superiority over current state-of-the-art approaches”. Could we also get comparisons against stronger, more common baselines that people actually use in practice?
For example:
Tiled diffusion’s DDIM inversion:
https://github.com/pkuliyi2015/multidiffusion-upscaler-for-automatic1111
ControlNet Tile, updated yesterday (it looks like they intend this SR-like model to compete with Midjourney V5/5.1 on image detail):
https://github.com/lllyasviel/ControlNet-v1-1-nightly#ControlNet-11-Tile
Loopback Scaler:
https://civitai.com/models/23188/loopback-scaler
DeepFloyd’s 256 stage model (IF-III-L):
https://github.com/deep-floyd/IF
Some of these methods take text prompts, but getting a prompt from a small image is trivial with BLIP, and all ControlNets have a “guess mode” that accepts an empty string as the prompt. Loopback Scaler and Tiled Diffusion seem to recommend using the same prompt string whatever the image is, so they do not really require image-specific prompts.
Most of these methods can be used easily by installing the latest version of automatic1111.