Fast Deep Stereo with 2D Convolutional Processing of Cost Signatures

March 08, 2019 · Entered Twilight · 🏛 IEEE Workshop/Winter Conference on Applications of Computer Vision

🌅 TWILIGHT: Old Age
Predates the code-sharing era, a pioneer of its time

"No code URL or promise found in abstract"
"Code repo scraped from project page (backfill)"

Evidence collected by the PWNC Scanner

Repo contents: .gitignore, README.md, data.py, data, dops.py, instats.py, model.py, pdata.py, popstats.py, pproc.py, ptrain.py, run.py, slib, train.py, utils.py, wts

Authors: Kyle Yee, Ayan Chakrabarti
arXiv ID: 1903.04939
Category: cs.CV (Computer Vision)
Cross-listed: cs.RO
Citations: 35
Venue: IEEE Workshop/Winter Conference on Applications of Computer Vision
Repository: https://github.com/ayanc/fdscs (⭐ 46)
Last Checked: 5 days ago
Abstract
Modern neural network-based algorithms are able to produce highly accurate depth estimates from stereo image pairs, nearly matching the reliability of measurements from more expensive depth sensors. However, this accuracy comes with a higher computational cost since these methods use network architectures designed to compute and process matching scores across all candidate matches at all locations, with floating point computations repeated across a match volume with dimensions corresponding to both space and disparity. This leads to longer running times to process each image pair, making them impractical for real-time use in robots and autonomous vehicles. We propose a new stereo algorithm that employs a significantly more efficient network architecture. Our method builds an initial match cost volume using traditional matching costs that are fast to compute, and trains a network to estimate disparity from this volume. Crucially, our network only employs per-pixel and two-dimensional convolution operations: to summarize the match information at each location as a low-dimensional feature vector, and to spatially process these 'cost-signature' features to produce a dense disparity map. Experimental results on the KITTI benchmark show that our method delivers competitive accuracy at significantly higher speeds, running at 48 frames per second on a modern GPU.
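
As a rough illustration of the two steps the abstract names (a hand-crafted match cost volume, then a per-pixel reduction to a low-dimensional feature vector), here is a minimal NumPy sketch. It is not taken from the repository. The absolute-difference cost, the function names `cost_volume` and `cost_signature`, and the randomly initialized weight matrix are all assumptions made for illustration; the abstract does not say which traditional cost or layer sizes the authors use, and in the paper the per-pixel mapping is learned.

```python
import numpy as np

def cost_volume(left, right, max_disp):
    """Absolute-difference matching cost per pixel and candidate disparity.

    left, right: (H, W) float arrays holding a rectified grayscale pair.
    Returns an (H, W, max_disp) array. Entries with no valid match
    (x < d) are filled with the largest finite cost in the volume.
    NOTE: absolute difference is an assumed stand-in for the paper's
    "traditional matching costs".
    """
    h, w = left.shape
    vol = np.full((h, w, max_disp), np.inf, dtype=np.float32)
    for d in range(max_disp):
        # Left pixel (y, x) is compared with right pixel (y, x - d).
        vol[:, d:, d] = np.abs(left[:, d:] - right[:, :w - d])
    finite = np.isfinite(vol)
    vol[~finite] = vol[finite].max()
    return vol

def cost_signature(vol, weights):
    """Per-pixel linear map plus ReLU, equivalent to a 1x1 convolution.

    vol: (H, W, D) cost volume; weights: (D, K) matrix with K < D.
    The weights here are hypothetical; a trained network would learn them.
    """
    return np.maximum(vol @ weights, 0.0)
```

Because the reduction touches each pixel independently, everything downstream can operate on an (H, W, K) feature map with ordinary 2D convolutions instead of processing the full (H, W, D) volume, which is the source of the speed-up the abstract describes.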