FlowCam: Training Generalizable 3D Radiance Fields without Camera Poses via Pixel-Aligned Scene Flow

May 31, 2023 ยท Entered Twilight ยท ๐Ÿ› Neural Information Processing Systems

๐Ÿ’ค TWILIGHT: Eternal Rest
Repo abandoned since publication

"No code URL or promise found in abstract"
"Derived repo from GitHub Pages (backfill)"

Evidence collected by the PWNC Scanner

Repo contents: README.md, conv_modules.py, data, demo.py, eval.py, geometry.py, mlp_modules.py, models.py, renderer.py, run.py, train.py, vis_scripts.py, wandb_logging.py

Authors Cameron Smith, Yilun Du, Ayush Tewari, Vincent Sitzmann arXiv ID 2306.00180 Category cs.CV: Computer Vision Cross-listed cs.AI, cs.GR, cs.LG Citations 39 Venue Neural Information Processing Systems Repository https://github.com/cameronosmith/flowcam โญ 147 Last Checked 16 days ago
Abstract
Reconstruction of 3D neural fields from posed images has emerged as a promising method for self-supervised representation learning. The key challenge preventing the deployment of these 3D scene learners on large-scale video data is their dependence on precise camera poses from structure-from-motion, which is prohibitively expensive to run at scale. We propose a method that jointly reconstructs camera poses and 3D neural scene representations online and in a single forward pass. We estimate poses by first lifting frame-to-frame optical flow to 3D scene flow via differentiable rendering, preserving locality and shift-equivariance of the image processing backbone. SE(3) camera pose estimation is then performed via a weighted least-squares fit to the scene flow field. This formulation enables us to jointly supervise pose estimation and a generalizable neural scene representation via re-rendering the input video, and thus, train end-to-end and fully self-supervised on real-world video datasets. We demonstrate that our method performs robustly on diverse, real-world video, notably on sequences traditionally challenging to optimization-based pose estimation techniques.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Computer Vision