Old Age
Drag Your Noise: Interactive Point-based Editing via Diffusion Semantic Propagation
April 01, 2024 · Entered Twilight · Computer Vision and Pattern Recognition
Repo contents: README.md, drag_bench_evaluation, drag_pipeline.py, drag_ui.py, environment.yaml, image, local_pretrained_models, lora, utils
Authors
Haofeng Liu, Chenshu Xu, Yifei Yang, Lihua Zeng, Shengfeng He
arXiv ID
2404.01050
Category
cs.CV: Computer Vision
Cross-listed
cs.GR, cs.HC, cs.LG
Citations
52
Venue
Computer Vision and Pattern Recognition
Repository
https://github.com/haofengl/DragNoise
⭐ 87
Last Checked
1 month ago
Abstract
Point-based interactive editing serves as an essential tool to complement the controllability of existing generative models. A concurrent work, DragDiffusion, updates the diffusion latent map in response to user inputs, causing global latent map alterations. This results in imprecise preservation of the original content and unsuccessful editing due to gradient vanishing. In contrast, we present DragNoise, offering robust and accelerated editing without retracing the latent map. The core rationale of DragNoise lies in utilizing the predicted noise output of each U-Net as a semantic editor. This approach is grounded in two critical observations: firstly, the bottleneck features of U-Net inherently possess semantically rich features ideal for interactive editing; secondly, high-level semantics, established early in the denoising process, show minimal variation in subsequent stages. Leveraging these insights, DragNoise edits diffusion semantics in a single denoising step and efficiently propagates these changes, ensuring stability and efficiency in diffusion editing. Comparative experiments reveal that DragNoise achieves superior control and semantic retention, reducing the optimization time by over 50% compared to DragDiffusion. Our codes are available at https://github.com/haofengl/DragNoise.
Similar Papers
In the same crypt: Computer Vision
Old Age
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
R.I.P.
👻
Ghosted
You Only Look Once: Unified, Real-Time Object Detection
Old Age
SSD: Single Shot MultiBox Detector
Old Age
Squeeze-and-Excitation Networks
R.I.P.
👻
Ghosted