Migrating Code At Scale With LLMs At Google
April 13, 2025 Β· Declared Dead Β· π SIGSOFT FSE Companion
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Celal Ziftci, Stoyan Nikolov, Anna SjΓΆvall, Bo Kim, Daniele Codecasa, Max Kim
arXiv ID
2504.09691
Category
cs.SE: Software Engineering
Cross-listed
cs.AI
Citations
11
Venue
SIGSOFT FSE Companion
Last Checked
3 months ago
Abstract
Developers often evolve an existing software system by making internal changes, called migration. Moving to a new framework, changing implementation to improve efficiency, and upgrading a dependency to its latest version are examples of migrations. Migration is a common and typically continuous maintenance task undertaken either manually or through tooling. Certain migrations are labor intensive and costly, developers do not find the required work rewarding, and they may take years to complete. Hence, automation is preferred for such migrations. In this paper, we discuss a large-scale, costly and traditionally manual migration project at Google, propose a novel automated algorithm that uses change location discovery and a Large Language Model (LLM) to aid developers conduct the migration, report the results of a large case study, and discuss lessons learned. Our case study on 39 distinct migrations undertaken by three developers over twelve months shows that a total of 595 code changes with 93,574 edits have been submitted, where 74.45% of the code changes and 69.46% of the edits were generated by the LLM. The developers reported high satisfaction with the automated tooling, and estimated a 50% reduction on the total time spent on the migration compared to earlier manual migrations. Our results suggest that our automated, LLM-assisted workflow can serve as a model for similar initiatives.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Software Engineering
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
GraphCodeBERT: Pre-training Code Representations with Data Flow
R.I.P.
π»
Ghosted
DeepTest: Automated Testing of Deep-Neural-Network-driven Autonomous Cars
R.I.P.
π»
Ghosted
Microservices: yesterday, today, and tomorrow
R.I.P.
π»
Ghosted
Devign: Effective Vulnerability Identification by Learning Comprehensive Program Semantics via Graph Neural Networks
R.I.P.
π»
Ghosted
A Survey of Machine Learning for Big Code and Naturalness
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Language Models are Few-Shot Learners
R.I.P.
π»
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
π»
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
π»
Ghosted