Speed3R: Sparse Feed-forward 3D Reconstruction Models

March 09, 2026 · Grace Period · 🏛 CVPR 2026 Findings

โณ Grace Period
This paper is less than 90 days old. We give authors time to release their code before passing judgment.
Authors: Weining Ren, Xiao Tan, Kai Han
arXiv ID: 2603.08055
Category: cs.CV (Computer Vision)
Cross-listed: cs.AI
Citations: 0
Venue: CVPR 2026 Findings
Abstract
While recent feed-forward 3D reconstruction models accelerate 3D reconstruction by jointly inferring dense geometry and camera poses in a single pass, their reliance on dense attention imposes a quadratic complexity, creating a prohibitive computational bottleneck that severely limits inference speed. To resolve this, we introduce Speed3R, an end-to-end trainable model inspired by the core principle of Structure-from-Motion: that a sparse set of keypoints is sufficient for robust pose estimation. Speed3R features a dual-branch attention mechanism where a compression branch creates a coarse contextual prior to guide a selection branch, which performs fine-grained attention only on the most informative image tokens. This strategy mimics the efficiency of traditional keypoint matching, achieving a remarkable 12.4x inference speedup on 1000-view sequences, while introducing a minimal, controlled trade-off in geometric accuracy. Validated on standard benchmarks with both VGGT and $\pi^3$ backbones, our method delivers high-quality reconstructions at a fraction of computational cost, paving the way for efficient large-scale scene modeling.
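To make the dual-branch idea in the abstract concrete, here is a minimal pure-Python sketch of coarse-to-fine sparse attention for a single query. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the compression branch pools tokens or how the selection branch picks them, so the mean pooling, the `block_size` and `top_blocks` parameters, and the function names `attend` and `sparse_attend` are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, keys, values):
    """Dense scaled dot-product attention of one query over all given tokens."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]


def sparse_attend(query, keys, values, block_size, top_blocks):
    """Coarse-to-fine attention (assumed design, see lead-in).

    Returns the attention output and the sorted indices of the tokens used.
    """
    n, dim = len(keys), len(query)
    blocks = [range(s, min(s + block_size, n)) for s in range(0, n, block_size)]

    # Compression branch (assumption: mean-pool the keys of each block).
    coarse_keys = [
        [sum(keys[i][j] for i in blk) / len(blk) for j in range(dim)]
        for blk in blocks
    ]
    coarse_scores = [dot(query, ck) for ck in coarse_keys]

    # Selection branch: keep only the highest-scoring blocks, then run
    # fine-grained attention on the tokens inside them.
    ranked = sorted(range(len(blocks)), key=lambda b: coarse_scores[b], reverse=True)
    chosen = sorted(i for b in ranked[:top_blocks] for i in blocks[b])
    output = attend(query, [keys[i] for i in chosen], [values[i] for i in chosen])
    return output, chosen
```

In this sketch each query scores `len(keys) / block_size` pooled keys plus `top_blocks * block_size` selected keys, instead of all `len(keys)` keys as in dense attention. When `top_blocks` covers every block, `sparse_attend` reduces to `attend`.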
