Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene

December 05, 2017 · Entered Twilight · 🏛 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition

"No code URL or promise found in abstract"
"Derived repo from GitHub Pages (backfill)"

Evidence collected by the PWNC Scanner

Repo contents: .gitignore, README.md, __init__.py, benchmark, data, demo, docs, experiments, nnutils, preprocess, renderer, utils

Authors: Shubham Tulsiani, Saurabh Gupta, David Fouhey, Alexei A. Efros, Jitendra Malik
arXiv ID: 1712.01812
Category: cs.CV (Computer Vision)
Citations: 136
Venue: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
Repository: https://github.com/shubhtuls/factored3d (⭐ 178)
Last Checked: 11 days ago

Abstract

The goal of this paper is to take a single 2D image of a scene and recover the 3D structure in terms of a small set of factors: a layout representing the enclosing surfaces as well as a set of objects represented in terms of shape and pose. We propose a convolutional neural network-based approach to predict this representation and benchmark it on a large dataset of indoor scenes. Our experiments evaluate a number of practical design questions, demonstrate that we can infer this representation, and quantitatively and qualitatively demonstrate its merits compared to alternate representations.
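
To make the factored representation concrete, here is a minimal Python sketch of one way such a scene description could be stored: a layout plus a list of objects, each with a shape and a pose. The abstract does not specify any parameterization, so everything below is an assumption made for illustration: the class names (FactoredScene, ObjectFactor), the voxel-grid shape, the scale/quaternion/translation pose, the per-pixel depth layout, and the helper place_point are all hypothetical and are not taken from the authors' repository.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Quat = Tuple[float, float, float, float]  # unit quaternion (w, x, y, z)


@dataclass
class ObjectFactor:
    """One object: a shape in its own canonical frame plus a pose in the scene.

    Assumed parameterization (not stated in the abstract).
    """
    shape: List[List[List[float]]]  # hypothetical voxel occupancy grid
    scale: Vec3
    rotation: Quat
    translation: Vec3


@dataclass
class FactoredScene:
    """A scene factored into an enclosing layout and a set of objects."""
    layout: List[List[float]]  # hypothetical per-pixel depth of enclosing surfaces
    objects: List[ObjectFactor] = field(default_factory=list)


def _cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _rotate(q: Quat, v: Vec3) -> Vec3:
    """Rotate v by unit quaternion q using v' = v + w*t + u x t, with t = 2 (u x v)."""
    w, u = q[0], (q[1], q[2], q[3])
    c = _cross(u, v)
    t = (2 * c[0], 2 * c[1], 2 * c[2])
    ut = _cross(u, t)
    return (v[0] + w * t[0] + ut[0],
            v[1] + w * t[1] + ut[1],
            v[2] + w * t[2] + ut[2])


def place_point(obj: ObjectFactor, p: Vec3) -> Vec3:
    """Map a point from the object's canonical frame into the scene frame.

    Applies scale, then rotation, then translation (an assumed ordering).
    """
    scaled = (p[0] * obj.scale[0], p[1] * obj.scale[1], p[2] * obj.scale[2])
    r = _rotate(obj.rotation, scaled)
    return (r[0] + obj.translation[0],
            r[1] + obj.translation[1],
            r[2] + obj.translation[2])


# Usage: one object with a trivial 1x1x1 shape, rotated 90 degrees about z.
s = 2 ** 0.5 / 2
chair = ObjectFactor(shape=[[[1.0]]], scale=(2.0, 2.0, 2.0),
                     rotation=(s, 0.0, 0.0, s), translation=(1.0, 0.0, 0.0))
scene = FactoredScene(layout=[[3.0, 3.0], [3.0, 3.0]], objects=[chair])
placed = place_point(scene.objects[0], (1.0, 0.0, 0.0))
```

Keeping shape and pose as separate fields mirrors the factoring described in the abstract: the same canonical shape can be moved, rotated, or resized in the scene by changing only its pose, while the layout is stored independently of the objects.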