Towards Mitigating API Hallucination in Code Generated by LLMs with Hierarchical Dependency Aware
May 08, 2025 ยท Declared Dead ยท ๐ SIGSOFT FSE Companion
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Yujia Chen, Mingyu Chen, Cuiyun Gao, Zhihan Jiang, Zhongqi Li, Yuchi Ma
arXiv ID
2505.05057
Category
cs.SE: Software Engineering
Citations
5
Venue
SIGSOFT FSE Companion
Last Checked
3 months ago
Abstract
Application Programming Interfaces (APIs) are crucial in modern software development. Large Language Models (LLMs) assist in automated code generation but often struggle with API hallucination, including invoking non-existent APIs and misusing existing ones in practical development scenarios. Existing studies resort to Retrieval-Augmented Generation (RAG) methods for mitigating the hallucination issue, but tend to fail since they generally ignore the structural dependencies in practical projects and do not indeed validate whether the generated APIs are available or not. To address these limitations, we propose MARIN, a framework for mitigating API hallucination in code generated by LLMs with hierarchical dependency aware. MARIN consists of two phases: Hierarchical Dependency Mining, which analyzes local and global dependencies of the current function, aiming to supplement comprehensive project context in LLMs input, and Dependency Constrained Decoding, which utilizes mined dependencies to adaptively constrain the generation process, aiming to ensure the generated APIs align with the projects specifications. To facilitate the evaluation of the degree of API hallucination, we introduce a new benchmark APIHulBench and two new metrics including Micro Hallucination Number (MiHN) and Macro Hallucination Rate (MaHR). Experiments on six state-of-the-art LLMs demonstrate that MARIN effectively reduces API hallucinations, achieving an average decrease of 67.52% in MiHN and 73.56% in MaHR compared to the RAG approach. Applied to Huaweis internal projects and two proprietary LLMs, MARIN achieves average decreases of 57.33% in MiHN and 59.41% in MaHR.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Software Engineering
R.I.P.
๐ป
Ghosted
R.I.P.
๐ป
Ghosted
GraphCodeBERT: Pre-training Code Representations with Data Flow
R.I.P.
๐ป
Ghosted
DeepTest: Automated Testing of Deep-Neural-Network-driven Autonomous Cars
R.I.P.
๐ป
Ghosted
Microservices: yesterday, today, and tomorrow
R.I.P.
๐ป
Ghosted
Devign: Effective Vulnerability Identification by Learning Comprehensive Program Semantics via Graph Neural Networks
R.I.P.
๐ป
Ghosted
A Survey of Machine Learning for Big Code and Naturalness
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Language Models are Few-Shot Learners
R.I.P.
๐ป
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
๐ป
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
๐ป
Ghosted