Enhancing Image Layout Control with Loss-Guided Diffusion Models

May 23, 2024 Β· Declared Dead Β· πŸ› IEEE Workshop/Winter Conference on Applications of Computer Vision

πŸ‘» CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Zakaria Patel, Kirill Serkh arXiv ID 2405.14101 Category cs.CV: Computer Vision Cross-listed cs.GR, cs.LG Citations 5 Venue IEEE Workshop/Winter Conference on Applications of Computer Vision Last Checked 3 months ago
Abstract
Diffusion models are a powerful class of generative models capable of producing high-quality images from pure noise using a simple text prompt. While most methods which introduce additional spatial constraints into the generated images (e.g., bounding boxes) require fine-tuning, a smaller and more recent subset of these methods take advantage of the models' attention mechanism, and are training-free. These methods generally fall into one of two categories. The first entails modifying the cross-attention maps of specific tokens directly to enhance the signal in certain regions of the image. The second works by defining a loss function over the cross-attention maps, and using the gradient of this loss to guide the latent. While previous work explores these as alternative strategies, we provide an interpretation for these methods which highlights their complimentary features, and demonstrate that it is possible to obtain superior performance when both methods are used in concert.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Computer Vision

πŸŒ… πŸŒ… Old Age

Fast R-CNN

Ross Girshick

cs.CV πŸ› ICCV πŸ“š 27.7K cites 11 years ago

Died the same way β€” πŸ‘» Ghosted