R.I.P.
👻
Ghosted
Flow2Code: Evaluating Large Language Models for Flowchart-based Code Generation Capability
June 02, 2025 · Declared Dead · 🏛 Annual Meeting of the Association for Computational Linguistics
Authors
Mengliang He, Jiayi Zeng, Yankai Jiang, Wei Zhang, Zeming Liu, Xiaoming Shi, Aimin Zhou
arXiv ID
2506.02073
Category
cs.SE: Software Engineering
Cross-listed
cs.AI
Citations
5
Venue
Annual Meeting of the Association for Computational Linguistics
Repository
https://github.com/hml-github/Flow2Code
⭐ 1
Last Checked
1 month ago
Abstract
While large language models (LLMs) show promise in code generation, existing benchmarks neglect flowchart-based code generation. To promote further research in this area, this work presents Flow2Code, a novel benchmark for evaluating flowchart-based code generation. The evaluation dataset spans 15 programming languages and includes 5,622 code segments paired with 16,866 flowcharts of three types: code, UML, and pseudocode. Extensive experiments with 13 multimodal LLMs reveal that current LLMs cannot yet generate code from flowcharts perfectly. In addition, the experimental results show that supervised fine-tuning contributes greatly to the models' performance. We publicly release our code and datasets at https://github.com/hml-github/Flow2Code.
📜 Similar Papers
In the same crypt: Software Engineering
👻 Ghosted · GraphCodeBERT: Pre-training Code Representations with Data Flow
👻 Ghosted · DeepTest: Automated Testing of Deep-Neural-Network-driven Autonomous Cars
👻 Ghosted · Microservices: yesterday, today, and tomorrow
👻 Ghosted · Devign: Effective Vulnerability Identification by Learning Comprehensive Program Semantics via Graph Neural Networks
👻 Ghosted · A Survey of Machine Learning for Big Code and Naturalness
Died the same way: ⚰️ The Empty Tomb
⚰️ The Empty Tomb · DSFD: Dual Shot Face Detector
⚰️ The Empty Tomb · InstanceCut: from Edges to Instances with MultiCut
⚰️ The Empty Tomb · FLNet: Landmark Driven Fetching and Learning Network for Faithful Talking Facial Animation Synthesis