ManiGAN: Text-Guided Image Manipulation

December 12, 2019 · Entered Twilight · 🏛 Computer Vision and Pattern Recognition

🌅 TWILIGHT: Old Age
Predates the code-sharing era: a pioneer of its time

"Last commit was 5.0 years ago (≥5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: DAMSMencoders, README.md, archi.jpg, code, data, eval, models, output

Authors: Bowen Li, Xiaojuan Qi, Thomas Lukasiewicz, Philip H. S. Torr
arXiv ID: 1912.06203
Category: cs.CV (Computer Vision)
Cross-listed: cs.CL, cs.LG
Citations: 313
Venue: Computer Vision and Pattern Recognition
Repository: https://github.com/mrlibw/ManiGAN (⭐ 146)
Last Checked: 1 month ago
Abstract
The goal of our paper is to semantically edit parts of an image matching a given text that describes desired attributes (e.g., texture, colour, and background), while preserving other contents that are irrelevant to the text. To achieve this, we propose a novel generative adversarial network (ManiGAN), which contains two key components: text-image affine combination module (ACM) and detail correction module (DCM). The ACM selects image regions relevant to the given text and then correlates the regions with corresponding semantic words for effective manipulation. Meanwhile, it encodes original image features to help reconstruct text-irrelevant contents. The DCM rectifies mismatched attributes and completes missing contents of the synthetic image. Finally, we suggest a new metric for evaluating image manipulation results, in terms of both the generation of new attributes and the reconstruction of text-irrelevant contents. Extensive experiments on the CUB and COCO datasets demonstrate the superior performance of the proposed method. Code is available at https://github.com/mrlibw/ManiGAN.
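The abstract names the affine combination module but does not give its equations. As a rough illustration only, the sketch below assumes the combination takes the form "text features times a scale, plus a bias", where both the scale and the bias are predicted from the image features. The function name, the array shapes, and the `w_scale` / `w_bias` matrices (standing in for learned 1x1 convolutions) are hypothetical and are not taken from the ManiGAN repository.

```python
import numpy as np


def affine_combination(text_feat, image_feat, w_scale, w_bias):
    """Fuse text and image features as text_feat * scale + bias.

    text_feat, image_feat: arrays of shape (C, H, W).
    w_scale, w_bias: hypothetical (C, C) matrices standing in for learned
    1x1 convolutions that turn image features into a scale and a bias.
    """
    # A 1x1 convolution is a channel-mixing matrix applied at every position.
    scale = np.einsum("oc,chw->ohw", w_scale, image_feat)
    bias = np.einsum("oc,chw->ohw", w_bias, image_feat)
    # Element-wise affine transform of the text features.
    return text_feat * scale + bias


rng = np.random.default_rng(0)
C, H, W = 4, 8, 8
text_feat = rng.standard_normal((C, H, W))
image_feat = rng.standard_normal((C, H, W))
w_scale = rng.standard_normal((C, C))
w_bias = rng.standard_normal((C, C))

fused = affine_combination(text_feat, image_feat, w_scale, w_bias)
print(fused.shape)  # (4, 8, 8)
```

Under this assumption, the bias term gives image content a path that does not depend on the text, which is one way to read the abstract's claim that the module "encodes original image features to help reconstruct text-irrelevant contents". With a zero scale and an identity bias map, for example, the output is just the image features.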
