Zero-Shot Goal Recognition with Large Language Models

May 14, 2026 Β· Grace Period Β· πŸ› NeurIPS 2026

⏳ Grace Period
This paper is less than 90 days old. We give authors time to release their code before passing judgment.
Authors Kin Max Piamolini GusmΓ£o, Nathan Gavenski, Nir Oren, Felipe Meneguzzi arXiv ID 2605.15333 Category cs.AI: Artificial Intelligence Citations 0 Venue NeurIPS 2026
Abstract
Large language models have recently reached near-parity with classical planners on well-known planning domains, yet this competence relies on world-knowledge exploitation rather than genuine symbolic reasoning. Goal recognition is a complementary abductive task structurally better suited to LLM strengths: it consists of evaluating consistency with world knowledge rather than generating novel action sequences. This paper provides the first systematic zero-shot evaluation of frontier LLMs as goal recognisers on key classical PDDL benchmarks. Our results show that LLM competence on goal recognition is uneven: some models scale with evidence and approach landmark-based accuracy at full observations, while others remain anchored to world-knowledge priors regardless of how much evidence accumulates. Qualitative analysis of model reasoning traces reveals that this divergence reflects a fundamental difference in evidence integration rather than domain familiarity. These findings position goal recognition as a principled benchmark for the foundational planning knowledge of LLMs.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Artificial Intelligence