Dynamic Object Removal and Spatio-Temporal RGB-D Inpainting via Geometry-Aware Adversarial Learning

August 12, 2020 · Entered Twilight · 🏛 IEEE Transactions on Intelligent Vehicles

"No code URL or promise found in abstract"
"Code repo scraped from project page (backfill)"

Evidence collected by the PWNC Scanner

Repo contents: .gitignore, LICENSE, README.md, checkpoints, datasets.py, demo.py, environment.yml, flow.py, functional.py, img, models, modules.py, utils.py

Authors: Borna Bešić, Abhinav Valada
arXiv ID: 2008.05058
Category: cs.CV (Computer Vision)
Cross-listed: cs.LG, cs.RO
Citations: 37
Venue: IEEE Transactions on Intelligent Vehicles
Repository: https://github.com/robot-learning-freiburg/DynaFill (⭐ 26)
Last Checked: 24 days ago

Abstract

Dynamic objects have a significant impact on the robot's perception of the environment, which degrades the performance of essential tasks such as localization and mapping. In this work, we address this problem by synthesizing plausible color, texture and geometry in regions occluded by dynamic objects. We propose the novel geometry-aware DynaFill architecture that follows a coarse-to-fine topology and incorporates our gated recurrent feedback mechanism to adaptively fuse information from previous timesteps. We optimize our architecture using adversarial training to synthesize fine, realistic textures, which enables it to hallucinate color and depth structure in occluded regions online in a spatially and temporally coherent manner, without relying on future frame information. Casting our inpainting problem as an image-to-image translation task, our model also corrects regions correlated with the presence of dynamic objects in the scene, such as shadows or reflections. We introduce a large-scale hyperrealistic dataset with RGB-D images, semantic segmentation labels, camera poses, as well as ground-truth RGB-D information of occluded regions. Extensive quantitative and qualitative evaluations show that our approach achieves state-of-the-art performance, even in challenging weather conditions. Furthermore, we present results for retrieval-based visual localization with the synthesized images that demonstrate the utility of our approach.
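
The abstract does not specify how the gated recurrent feedback is computed, so the following is only a minimal sketch of the general idea of gated fusion across timesteps, not the DynaFill implementation. It assumes a scalar sigmoid gate per feature value, and the names `gated_fusion`, `run_sequence`, `w_cur`, `w_prev` and `bias` are hypothetical stand-ins for what would be learned convolutional parameters in a real network. The loop only ever reads the current frame and the previous state, which mirrors the "online, no future frames" property described above.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_fusion(current, previous, w_cur=1.0, w_prev=-1.0, bias=0.0):
    """Blend current-frame features with features fed back from the previous step.

    current, previous: equal-length sequences of floats (flattened features).
    w_cur, w_prev, bias: hypothetical gate parameters (assumed, not from the paper).
    Each output value is a convex combination of the two inputs.
    """
    fused = []
    for c, p in zip(current, previous):
        gate = sigmoid(w_cur * c + w_prev * p + bias)  # gate lies in (0, 1)
        fused.append(gate * c + (1.0 - gate) * p)
    return fused


def run_sequence(frames, **gate_params):
    """Process frames in order, carrying the fused state forward.

    Only past information is used: step t sees frame t and the state from t-1.
    """
    state = list(frames[0])  # no history exists for the first frame
    outputs = [state]
    for frame in frames[1:]:
        state = gated_fusion(frame, state, **gate_params)
        outputs.append(state)
    return outputs
```

With all gate parameters set to zero the gate is 0.5 everywhere, so each step averages the new frame with the carried state; non-zero parameters let the blend lean toward either the new observation or the history.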