Improving the Stability and Efficiency of Diffusion Models for Content Consistent Super-Resolution

December 30, 2023 Β· Declared Dead Β· πŸ› IEEE Transactions on Image Processing

πŸ’€ CAUSE OF DEATH: 404 Not Found
Code link is broken/dead
Authors Lingchen Sun, Rongyuan Wu, Jie Liang, Zhengqiang Zhang, Hongwei Yong, Lei Zhang arXiv ID 2401.00877 Category eess.IV: Image & Video Processing Cross-listed cs.CV Citations 7 Venue IEEE Transactions on Image Processing Repository https://github.com/csslc/CCSR}{https://github.com/csslc/CCSR} Last Checked 2 months ago
Abstract
The generative priors of pre-trained latent diffusion models (DMs) have demonstrated great potential to enhance the visual quality of image super-resolution (SR) results. However, the noise sampling process in DMs introduces randomness in the SR outputs, and the generated contents can differ a lot with different noise samples. The multi-step diffusion process can be accelerated by distilling methods, but the generative capacity is difficult to control. To address these issues, we analyze the respective advantages of DMs and generative adversarial networks (GANs) and propose to partition the generative SR process into two stages, where the DM is employed for reconstructing image structures and the GAN is employed for improving fine-grained details. Specifically, we propose a non-uniform timestep sampling strategy in the first stage. A single timestep sampling is first applied to extract the coarse information from the input image, then a few reverse steps are used to reconstruct the main structures. In the second stage, we finetune the decoder of the pre-trained variational auto-encoder by adversarial GAN training for deterministic detail enhancement. Once trained, our proposed method, namely content consistent super-resolution (CCSR),allows flexible use of different diffusion steps in the inference stage without re-training. Extensive experiments show that with 2 or even 1 diffusion step, CCSR can significantly improve the content consistency of SR outputs while keeping high perceptual quality. Codes and models can be found at \href{https://github.com/csslc/CCSR}{https://github.com/csslc/CCSR}.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Image & Video Processing

Died the same way β€” πŸ’€ 404 Not Found