Taming Mode Collapse in Score Distillation for Text-to-3D Generation

December 31, 2023 · Entered Twilight · 🏛 Computer Vision and Pattern Recognition

💤 TWILIGHT: Eternal Rest
Repo abandoned since publication

"No code URL or promise found in abstract"
"Derived repo from GitHub Pages (backfill)"

Evidence collected by the PWNC Scanner

Repo contents: README.md, docs, gaussian_example.py, materials, teaser.gif

Authors: Peihao Wang, Dejia Xu, Zhiwen Fan, Dilin Wang, Sreyas Mohan, Forrest Iandola, Rakesh Ranjan, Yilei Li, Qiang Liu, Zhangyang Wang, Vikas Chandra
arXiv ID: 2401.00909
Category: cs.CV: Computer Vision
Cross-listed: cs.LG
Citations: 40
Venue: Computer Vision and Pattern Recognition
Repository: https://github.com/vita-group/3D-Mode-Collapse ⭐ 51
Last Checked: 9 days ago
Abstract
Despite the remarkable performance of score distillation in text-to-3D generation, such techniques notoriously suffer from view inconsistency issues, also known as the "Janus" artifact, where the generated objects fake each view with multiple front faces. Although empirically effective methods have approached this problem via score debiasing or prompt engineering, a more rigorous perspective to explain and tackle this problem remains elusive. In this paper, we reveal that the existing score distillation-based text-to-3D generation frameworks degenerate to maximum likelihood seeking on each view independently and thus suffer from the mode collapse problem, manifesting as the Janus artifact in practice. To tame mode collapse, we improve score distillation by re-establishing the entropy term in the corresponding variational objective, which is applied to the distribution of rendered images. Maximizing the entropy encourages diversity among different views in generated 3D assets, thereby mitigating the Janus problem. Based on this new objective, we derive a new update rule for 3D score distillation, dubbed Entropic Score Distillation (ESD). We theoretically reveal that ESD can be simplified and implemented by just adopting the classifier-free guidance trick upon variational score distillation. Although embarrassingly straightforward, our extensive experiments successfully demonstrate that ESD can be an effective treatment for Janus artifacts in score distillation.
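
The abstract's last technical claim, that ESD amounts to a classifier-free guidance trick applied on top of variational score distillation, can be illustrated with a toy sketch. This is not the authors' code: the function names, the `view_weight` parameter and its convention, and the use of plain Python lists in place of image-shaped tensors are all assumptions made for illustration. It assumes a fine-tuned noise predictor that can be queried both with and without a camera condition, blends the two predictions, and subtracts the blend from the pretrained model's prediction.

```python
def cfg_mix(cond, uncond, view_weight):
    """Blend two noise predictions in the style of classifier-free guidance.

    view_weight is a hypothetical knob: 1.0 keeps only the camera-conditioned
    prediction, 0.0 keeps only the camera-unconditioned one.
    """
    return [view_weight * c + (1.0 - view_weight) * u
            for c, u in zip(cond, uncond)]


def esd_style_residual(eps_pretrained, eps_view_cond, eps_view_uncond, view_weight):
    """Toy per-pixel residual: pretrained prediction minus the blended
    fine-tuned prediction. Assumed form, not taken from the paper's equations."""
    mixed = cfg_mix(eps_view_cond, eps_view_uncond, view_weight)
    return [p - m for p, m in zip(eps_pretrained, mixed)]


# Made-up numbers standing in for noise predictions on a two-pixel "image".
eps_pre = [1.0, 2.0]
eps_cond = [0.5, 1.0]
eps_uncond = [0.0, 0.0]

only_conditioned = esd_style_residual(eps_pre, eps_cond, eps_uncond, 1.0)
half_blend = esd_style_residual(eps_pre, eps_cond, eps_uncond, 0.5)
```

In this sketch, setting `view_weight` to 1.0 leaves only the camera-conditioned term, and lowering it brings in the camera-unconditioned prediction. How the weight maps onto the entropy coefficient in the actual method is not stated in the abstract and should be checked against the paper.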
