Early Life Cycle Software Defect Prediction. Why? How?

November 26, 2020 ยท Declared Dead ยท ๐Ÿ› International Conference on Software Engineering

๐Ÿ‘ป CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors N. C. Shrikanth, Suvodeep Majumder, Tim Menzies arXiv ID 2011.13071 Category cs.SE: Software Engineering Cross-listed cs.LG Citations 30 Venue International Conference on Software Engineering Last Checked 3 months ago
Abstract
Many researchers assume that, for software analytics, "more data is better." We write to show that, at least for learning defect predictors, this may not be true. To demonstrate this, we analyzed hundreds of popular GitHub projects. These projects ran for 84 months and contained 3,728 commits (median values). Across these projects, most of the defects occur very early in their life cycle. Hence, defect predictors learned from the first 150 commits and four months perform just as well as anything else. This means that, at least for the projects studied here, after the first few months, we need not continually update our defect prediction models. We hope these results inspire other researchers to adopt a "simplicity-first" approach to their work. Some domains require a complex and data-hungry analysis. But before assuming complexity, it is prudent to check the raw data looking for "short cuts" that can simplify the analysis.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Software Engineering

Died the same way โ€” ๐Ÿ‘ป Ghosted