Patching as Translation: the Data and the Metaphor
August 24, 2020 ยท Entered Twilight ยท ๐ International Conference on Automated Software Engineering
"Last commit was 5.0 years ago (โฅ5 year threshold)"
Evidence collected by the PWNC Scanner
Repo contents: Analysis, LICENSE, Model, README.md
Authors
Yangruibo Ding, Baishakhi Ray, Premkumar Devanbu, Vincent J. Hellendoorn
arXiv ID
2008.10707
Category
cs.SE: Software Engineering
Cross-listed
cs.LG,
cs.PL
Citations
74
Venue
International Conference on Automated Software Engineering
Repository
https://github.com/ARiSE-Lab/Patch-as-translation
โญ 7
Last Checked
1 month ago
Abstract
Machine Learning models from other fields, like Computational Linguistics, have been transplanted to Software Engineering tasks, often quite successfully. Yet a transplanted model's initial success at a given task does not necessarily mean it is well-suited for the task. In this work, we examine a common example of this phenomenon: the conceit that "software patching is like language translation". We demonstrate empirically that there are subtle, but critical distinctions between sequence-to-sequence models and translation model: while program repair benefits greatly from the former, general modeling architecture, it actually suffers from design decisions built into the latter, both in terms of translation accuracy and diversity. Given these findings, we demonstrate how a more principled approach to model design, based on our empirical findings and general knowledge of software development, can lead to better solutions. Our findings also lend strong support to the recent trend towards synthesizing edits of code conditional on the buggy context, to repair bugs. We implement such models ourselves as "proof-of-concept" tools and empirically confirm that they behave in a fundamentally different, more effective way than the studied translation-based architectures. Overall, our results demonstrate the merit of studying the intricacies of machine learned models in software engineering: not only can this help elucidate potential issues that may be overshadowed by increases in accuracy; it can also help innovate on these models to raise the state-of-the-art further. We will publicly release our replication data and materials at https://github.com/ARiSE-Lab/Patch-as-translation.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Software Engineering
R.I.P.
๐ป
Ghosted
R.I.P.
๐ป
Ghosted
GraphCodeBERT: Pre-training Code Representations with Data Flow
R.I.P.
๐ป
Ghosted
DeepTest: Automated Testing of Deep-Neural-Network-driven Autonomous Cars
R.I.P.
๐ป
Ghosted
Microservices: yesterday, today, and tomorrow
R.I.P.
๐ป
Ghosted
Devign: Effective Vulnerability Identification by Learning Comprehensive Program Semantics via Graph Neural Networks
R.I.P.
๐ป
Ghosted