Diffusion Prior Interpolation for Flexibility Real-World Face Super-Resolution

December 21, 2024 · Declared Dead · 🏛 AAAI Conference on Artificial Intelligence

Authors: Jiarui Yang, Tao Dai, Yufei Zhu, Naiqi Li, Jinmin Li, Shutao Xia
arXiv ID: 2412.16552
Category: cs.CV (Computer Vision)
Cross-listed: cs.AI
Citations: 6
Venue: AAAI Conference on Artificial Intelligence
Repository: https://github.com/JerryYann/DPI
Last Checked: 1 month ago

Abstract

Diffusion models represent the state of the art in generative modeling. Because of their high training costs, many works leverage the powerful representations of pre-trained diffusion models for downstream tasks such as face super-resolution (FSR), through either fine-tuning or prior-based methods. However, relying solely on priors without supervised training makes it challenging to meet the pixel-level accuracy requirements of discriminative tasks. Although prior-based methods can achieve high-fidelity, high-quality results, ensuring consistency remains a significant challenge. In this paper, we propose a masking strategy with strong and weak constraints and iterative refinement for real-world FSR, termed Diffusion Prior Interpolation (DPI). We introduce conditions and constraints on consistency by masking different sampling stages based on the structural characteristics of the face. Furthermore, we propose a condition Corrector (CRT) to establish a reciprocal posterior sampling process, enhancing FSR performance through mutual refinement of conditions and samples. DPI can balance consistency and diversity and can be seamlessly integrated into pre-trained models. In extensive experiments conducted on synthetic and real datasets, along with consistency validation in face recognition, DPI demonstrates superiority over SOTA FSR methods. The code is available at https://github.com/JerryYann/DPI.
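
The abstract does not spell out how the masks are built or scheduled, so the following is only a toy sketch of the general idea of mask-constrained sampling, not the authors' implementation. It assumes a dense grid mask as the "strong" constraint in early steps and a sparser grid as the "weak" constraint in later steps. The function names (make_mask, blend, toy_denoise, masked_sampling), the strides, and the switch point are all hypothetical, and toy_denoise merely stands in for a reverse step of a pre-trained diffusion model.

```python
import random


def make_mask(size, stride):
    """Binary grid mask: 1 marks a constrained pixel, one per stride x stride cell."""
    return [[1 if (r % stride == 0 and c % stride == 0) else 0
             for c in range(size)] for r in range(size)]


def blend(sample, condition, mask):
    """Take the condition pixel where the mask is 1, the sample pixel elsewhere."""
    return [[c if m else s for s, c, m in zip(srow, crow, mrow)]
            for srow, crow, mrow in zip(sample, condition, mask)]


def toy_denoise(sample, rng):
    """Hypothetical stand-in for one reverse step of a pre-trained diffusion model."""
    return [[0.9 * v + 0.1 * rng.random() for v in row] for row in sample]


def masked_sampling(condition, steps=10, switch=5, seed=0):
    """Alternate a denoising step with mask-based blending of the condition.

    Assumption: the dense (strong) mask is used before `switch`,
    the sparse (weak) mask afterwards.
    """
    rng = random.Random(seed)
    size = len(condition)
    strong = make_mask(size, stride=2)  # denser constraint (assumed)
    weak = make_mask(size, stride=4)    # sparser constraint (assumed)
    sample = [[rng.random() for _ in range(size)] for _ in range(size)]
    for t in range(steps):
        sample = toy_denoise(sample, rng)
        mask = strong if t < switch else weak
        sample = blend(sample, condition, mask)
    return sample


if __name__ == "__main__":
    condition = [[0.5] * 8 for _ in range(8)]  # placeholder for an upsampled LR face
    result = masked_sampling(condition)
    print(result[0][0], result[4][4])  # pixels pinned by both masks
```

In this sketch, pixels covered by the active mask are pinned to the condition after every step, while the remaining pixels are left to the generative prior. Loosening the mask in later steps is one plausible way to trade consistency against diversity; the actual schedule and mask design used by DPI should be taken from the paper and repository.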