MotionLM: Multi-Agent Motion Forecasting as Language Modeling

September 28, 2023 · Declared Dead · 🏛 IEEE International Conference on Computer Vision

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Ari Seff, Brian Cera, Dian Chen, Mason Ng, Aurick Zhou, Nigamaa Nayakanti, Khaled S. Refaat, Rami Al-Rfou, Benjamin Sapp arXiv ID 2309.16534 Category cs.CV: Computer Vision Cross-listed cs.AI, cs.LG, cs.RO Citations 160 Venue IEEE International Conference on Computer Vision Last Checked 4 months ago

Abstract

Reliable forecasting of the future behavior of road agents is a critical component to safe planning in autonomous vehicles. Here, we represent continuous trajectories as sequences of discrete motion tokens and cast multi-agent motion prediction as a language modeling task over this domain. Our model, MotionLM, provides several advantages: First, it does not require anchors or explicit latent variable optimization to learn multimodal distributions. Instead, we leverage a single standard language modeling objective, maximizing the average log probability over sequence tokens. Second, our approach bypasses post-hoc interaction heuristics where individual agent trajectory generation is conducted prior to interactive scoring. Instead, MotionLM produces joint distributions over interactive agent futures in a single autoregressive decoding process. In addition, the model's sequential factorization enables temporally causal conditional rollouts. The proposed approach establishes new state-of-the-art performance for multi-agent motion prediction on the Waymo Open Motion Dataset, ranking 1st on the interactive challenge leaderboard.