Method-level Bug Prediction: Problems and Promises

Times Cited: 6
Authors
Chowdhury, Shaiful [1 ]
Uddin, Gias [2 ]
Hemmati, Hadi [2 ]
Holmes, Reid [3 ]
Affiliations
[1] Univ Manitoba, Dept Comp Sci, E2 445 EITC,75 Chancellors Cir, Winnipeg, MB R3T 5V6, Canada
[2] York Univ, Elect Engn & Comp Sci Dept, 4700 Keele St, Toronto, ON M3J 1P3, Canada
[3] Univ British Columbia, ICICS Bldg, 201-2366 Main Mall, Vancouver, BC V6T 1Z4, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Method-level bug prediction; code metrics; maintenance; McCabe; code complexity; SOFTWARE DEFECT PREDICTION; CODE CHURN; METRICS; COMPLEXITY; VALIDATION; MODELS; IMPACT; SIZE;
DOI
10.1145/3640331
Chinese Library Classification
TP31 [Computer Software];
Discipline Codes
081202 ; 0835 ;
Abstract
Fixing software bugs can be colossally expensive, especially if they are discovered in the later phases of the software development life cycle. As such, bug prediction has been a classic problem for the research community. As of now, a Google Scholar search for the phrase "bug prediction" returns roughly 113,000 hits. Despite this staggering effort by the research community, bug prediction research is criticized for not being decisively adopted in practice. A significant problem of the existing research is the granularity level (i.e., class/file level) at which bug prediction is historically studied. Practitioners find it difficult and time-consuming to locate bugs at the class/file level of granularity. Consequently, method-level bug prediction has become popular in the past decade. We ask: are these method-level bug prediction models ready for industry use? Unfortunately, the answer is no. The reported high accuracies of these models dwindle significantly if we evaluate them in different realistic, time-sensitive contexts. It may seem hopeless at first, but, encouragingly, we show that future method-level bug prediction can be improved significantly. In general, we show how to reliably evaluate future method-level bug prediction models and how to improve them by focusing on four different improvement avenues: building noise-free bug data, addressing concept drift, selecting similar training projects, and developing a mixture of models. Our findings are based on three publicly available method-level bug datasets and a newly built bug dataset of 774,051 Java methods originating from 49 open-source software projects.
Pages: 31