Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving
November 29, 2023 · Entered Twilight · Computer Vision and Pattern Recognition
Repo contents: .github, .gitignore, CITATION.cff, CODE_OF_CONDUCT.md, CONTRIBUTING.md, LICENSE, MANIFEST.in, Makefile, PHILOSOPHY.md, README.md, _typos.toml, docker, docs, examples, pyproject.toml, scripts, setup.py, src, tests, utils
Authors
Yuqi Wang, Jiawei He, Lue Fan, Hongxin Li, Yuntao Chen, Zhaoxiang Zhang
arXiv ID
2311.17918
Category
cs.CV: Computer Vision
Citations
256
Venue
Computer Vision and Pattern Recognition
Repository
https://github.com/BraveGroup/Drive-WM
⭐ 410
Last Checked
1 month ago
Abstract
In autonomous driving, predicting future events in advance and evaluating the foreseeable risks empowers autonomous vehicles to better plan their actions, enhancing safety and efficiency on the road. To this end, we propose Drive-WM, the first driving world model compatible with existing end-to-end planning models. Through a joint spatial-temporal modeling facilitated by view factorization, our model generates high-fidelity multiview videos in driving scenes. Building on its powerful generation ability, we showcase the potential of applying the world model for safe driving planning for the first time. Particularly, our Drive-WM enables driving into multiple futures based on distinct driving maneuvers, and determines the optimal trajectory according to the image-based rewards. Evaluation on real-world driving datasets verifies that our method could generate high-quality, consistent, and controllable multiview videos, opening up possibilities for real-world simulations and safe planning.
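The planning idea in the abstract (imagine one future per candidate maneuver, score each future with a reward, keep the best) can be sketched as follows. This is a minimal illustration, not the Drive-WM implementation: `select_maneuver`, `toy_rollout`, `toy_reward`, the maneuver names, and the risk numbers are all hypothetical stand-ins. In the paper the rollout would be multiview video generation and the reward would be computed from the generated images.

```python
from typing import Callable, Dict, Sequence, TypeVar

Future = TypeVar("Future")  # whatever the world model produces for one maneuver


def select_maneuver(
    maneuvers: Sequence[str],
    rollout: Callable[[str], Future],
    reward: Callable[[Future], float],
) -> str:
    """Return the candidate maneuver whose imagined future scores highest."""
    if not maneuvers:
        raise ValueError("need at least one candidate maneuver")
    # Imagine one future per maneuver and score it.
    scores: Dict[str, float] = {m: reward(rollout(m)) for m in maneuvers}
    # Ties are resolved in favor of the earliest candidate.
    return max(scores, key=scores.get)


# Hypothetical stand-ins: the "future" is a single made-up risk value,
# where a real system would return generated multiview video.
def toy_rollout(maneuver: str) -> float:
    return {"keep_lane": 0.1, "turn_left": 0.7, "turn_right": 0.4}[maneuver]


def toy_reward(risk: float) -> float:
    return 1.0 - risk  # lower imagined risk means higher reward


best = select_maneuver(
    ["keep_lane", "turn_left", "turn_right"], toy_rollout, toy_reward
)
```

Passing the rollout and reward in as callables keeps the selection loop independent of how futures are generated or scored, so either part can be swapped without touching the loop.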
Similar Papers
In the same crypt: Computer Vision
Old Age
Old Age
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
R.I.P.
Ghosted
You Only Look Once: Unified, Real-Time Object Detection
Old Age
SSD: Single Shot MultiBox Detector
Old Age
Squeeze-and-Excitation Networks
R.I.P.
Ghosted