Method-level Bug Prediction: Problems and Promises

Cited by: 6
Authors
Chowdhury, Shaiful [1]
Uddin, Gias [2]
Hemmati, Hadi [2]
Holmes, Reid [3]
Affiliations
[1] Univ Manitoba, Dept Comp Sci, E2 445 EITC,75 Chancellors Cir, Winnipeg, MB R3T 5V6, Canada
[2] York Univ, Elect Engn & Comp Sci Dept, 4700 Keele St, Toronto, ON M3J 1P3, Canada
[3] Univ British Columbia, ICICS Bldg,201-2366 Main Mall, Vancouver, BC V6T 1Z4, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Method-level bug prediction; code metrics; maintenance; McCabe; code complexity; SOFTWARE DEFECT PREDICTION; CODE CHURN; METRICS; COMPLEXITY; VALIDATION; MODELS; IMPACT; SIZE;
DOI
10.1145/3640331
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Fixing software bugs can be colossally expensive, especially if they are discovered in the later phases of the software development life cycle. As such, bug prediction has been a classic problem for the research community. As of now, Google Scholar returns approximately 113,000 hits when searched with the phrase "bug prediction". Despite this staggering effort by the research community, bug prediction research is criticized for not being decisively adopted in practice. A significant problem of the existing research is the granularity level (i.e., class/file level) at which bug prediction has historically been studied. Practitioners find it difficult and time-consuming to locate bugs at the class/file level granularity. Consequently, method-level bug prediction has become popular in the past decade. We ask, are these method-level bug prediction models ready for industry use? Unfortunately, the answer is no. The reported high accuracies of these models dwindle significantly if we evaluate them in different realistic time-sensitive contexts. It may seem hopeless at first, but, encouragingly, we show that future method-level bug prediction can be improved significantly. In general, we show how to reliably evaluate future method-level bug prediction models and how to improve them by focusing on four different improvement avenues: building noise-free bug data, addressing concept drift, selecting similar training projects, and developing a mixture of models. Our findings are based on three publicly available method-level bug datasets and a newly built bug dataset of 774,051 Java methods originating from 49 open-source software projects.
Pages: 31