VLMPlanner: Integrating Visual Language Models with Motion Planning
July 27, 2025 · Declared Dead · 🏛 ACM Multimedia
"Paper promises code 'coming soon'"
Evidence collected by the PWNC Scanner
Authors
Zhipeng Tang, Sha Zhang, Jiajun Deng, Chenjie Wang, Guoliang You, Yuting Huang, Xinrui Lin, Yanyong Zhang
arXiv ID
2507.20342
Category
cs.AI: Artificial Intelligence
Cross-listed
cs.RO
Citations
3
Venue
ACM Multimedia
Last Checked
1 month ago
Abstract
Integrating large language models (LLMs) into autonomous driving motion planning has recently emerged as a promising direction, offering enhanced interpretability, better controllability, and improved generalization in rare and long-tail scenarios. However, existing methods often rely on abstracted perception or map-based inputs, missing crucial visual context, such as fine-grained road cues, accident aftermath, or unexpected obstacles, which are essential for robust decision-making in complex driving environments. To bridge this gap, we propose VLMPlanner, a hybrid framework that combines a learning-based real-time planner with a vision-language model (VLM) capable of reasoning over raw images. The VLM processes multi-view images to capture rich, detailed visual information and leverages its common-sense reasoning capabilities to guide the real-time planner in generating robust and safe trajectories. Furthermore, we develop the Context-Adaptive Inference Gate (CAI-Gate) mechanism that enables the VLM to mimic human driving behavior by dynamically adjusting its inference frequency based on scene complexity, thereby achieving an optimal balance between planning performance and computational efficiency. We evaluate our approach on the large-scale, challenging nuPlan benchmark, with comprehensive experimental results demonstrating superior planning performance in scenarios with intricate road conditions and dynamic elements. Code will be available.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
📜 Similar Papers
In the same crypt — Artificial Intelligence
R.I.P.
👻
Ghosted
R.I.P.
👻
Ghosted
Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI
R.I.P.
👻
Ghosted
Addressing Function Approximation Error in Actor-Critic Methods
R.I.P.
👻
Ghosted
Explanation in Artificial Intelligence: Insights from the Social Sciences
R.I.P.
👻
Ghosted
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge
R.I.P.
👻
Ghosted
Complex Embeddings for Simple Link Prediction
Died the same way — ⏳ Coming Soon™
R.I.P.
⏳
Coming Soon™
Exploring Simple Siamese Representation Learning
R.I.P.
⏳
Coming Soon™
An Analysis of Scale Invariance in Object Detection - SNIP
R.I.P.
⏳
Coming Soon™
Class-balanced Grouping and Sampling for Point Cloud 3D Object Detection
R.I.P.
⏳
Coming Soon™