The Cartographer
Reasoning Structure of Large Language Models
June 02, 2026 · Grace Period · ICML 2026 and presented at the ICLR 2026 workshop on LLM reasoning
Authors
Frédéric Berdoz, Luca A. Lanzendörfer, Fabian Farestam, Roger Wattenhofer
arXiv ID
2606.03883
Category
cs.AI: Artificial Intelligence
Cross-listed
cs.LG
Citations
0
Venue
ICML 2026 and presented at the ICLR 2026 workshop on LLM reasoning
Abstract
Large reasoning models (LRMs) are often evaluated using metrics such as final-answer accuracy or token count. However, identical scores on these metrics can hide fundamentally different reasoning structures. To address this limitation, we introduce a scalable LRM benchmark of logic puzzles and a pipeline that converts unstructured traces into verifiable reasoning graphs of claims and dependencies. This turns reasoning into a structured, measurable object whose topology can be quantitatively analyzed. Building on this, we define a reasoning efficiency metric that quantifies how concentrated the model's logical flow is. Our analysis on open-source reasoning models shows that structural measurements separate behaviors that token count and accuracy conflate, providing a practical tool for diagnosing failure modes and comparing how reasoning scales with puzzle difficulty.
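As a rough illustration of what a reasoning graph of claims and dependencies can look like, the sketch below stores a toy graph as a dictionary and computes the share of claims that the final answer actually rests on. This is a hypothetical proxy chosen for illustration, not the efficiency metric defined in the paper; the names `ancestors_of`, `useful_fraction`, and `toy_graph` are assumptions introduced here.

```python
from collections import deque


def ancestors_of(target, deps):
    """Return every claim the target depends on, directly or transitively."""
    seen = set()
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for parent in deps.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def useful_fraction(deps, answer):
    """Hypothetical proxy (not the paper's metric): share of all claims
    that lie on some dependency path leading to the final answer."""
    claims = set(deps) | {p for parents in deps.values() for p in parents}
    on_path = ancestors_of(answer, deps) | {answer}
    return len(on_path) / len(claims)


# Toy graph: each claim maps to the claims it depends on.
toy_graph = {
    "c1": [],
    "c2": ["c1"],
    "c3": ["c1"],  # dead end: nothing downstream uses this claim
    "answer": ["c2"],
}
score = useful_fraction(toy_graph, "answer")
```

Two traces with the same token count and the same final answer could still differ on a structural measure like this one, for example if one of them produces many dead-end claims.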
Similar Papers
In the same crypt: Artificial Intelligence
Explanation in Artificial Intelligence: Insights from the Social Sciences
Federated Machine Learning: Concept and Applications
Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR
DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks