The Sim-to-Real Gap of Foundation Model Agents: A Unified MDP Perspective

June 05, 2026 Β· Grace Period Β· πŸ› KDD 2026

⏳ Grace Period
This paper is less than 90 days old. We give authors time to release their code before passing judgment.
Authors Xiaoou Liu, Tiejin Chen, Weibo Li, Xiyang Hu, Hua Wei arXiv ID 2606.07017 Category cs.AI: Artificial Intelligence Cross-listed cs.CL, cs.ET Citations 0 Venue KDD 2026
Abstract
Foundation model agents are increasingly deployed for real-world decision-making, but suffer from the sim-to-real gap. While robotics and classical control have mature frameworks to address this gap, the foundation model community is treating agent robustness as an entirely novel phenomenon. Our paper proposes formalizing the foundation model agent evaluation and training gap as a classical sim-to-real problem structured entirely around the four elements of a Markov Decision Process, including Observation, Action, Transition, and Reward. In this paper, we set a comprehensive research agenda that translates classical discrepancies into the foundation model domain and advocates for adopting established solutions like domain randomization. We provide concrete examples, such as a multilingual tool calling to demonstrate how severe observation space gaps lead to operationally invalid actions despite correct semantic intent. Ultimately, this agenda aims to drive a paradigm shift, yielding a unified vocabulary and standardized stress test benchmarks to foster a new generation of highly trustworthy agents for reliable real-world applications.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Artificial Intelligence