An Empirical Study on Just-in-time Conformal Defect Prediction

Cited by: 0
Authors
Shahini, Xhulja [1]
Metzger, Andreas [1]
Pohl, Klaus [1]
Affiliations
[1] Paluno Univ Duisburg Essen, Essen, Germany
Source
2024 IEEE/ACM 21ST INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES, MSR | 2024
Keywords
Defect prediction; quality assurance; conformal prediction; machine learning; deep learning; correctness guarantees; uncertainty;
DOI
10.1145/3643991.3644928
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Code changes can introduce defects that affect software quality and reliability. Just-in-time (JIT) defect prediction techniques provide feedback at check-in time on whether a code change is likely to contain defects. This immediate feedback allows practitioners to make timely decisions regarding potential defects. However, a prediction model may deliver false predictions, which may negatively affect practitioners' decisions. False positive predictions lead to unnecessarily spending resources on investigating clean code changes, while false negative predictions may result in overlooking defective changes. Knowing how uncertain a defect prediction is would help practitioners avoid wrong decisions. Previous research in defect prediction explored different approaches to quantify prediction uncertainty for supporting decision-making activities. However, these approaches only offer a heuristic quantification of uncertainty and do not provide guarantees. In this study, we use conformal prediction (CP) as a rigorous uncertainty quantification approach on top of JIT defect predictors. We assess how often CP can provide guarantees for JIT defect predictions. We also assess how many false JIT defect predictions CP can filter out. We experiment with two state-of-the-art JIT defect prediction techniques (DeepJIT and CC2Vec) and two widely used datasets (Qt and OpenStack). Our experiments show that CP can ensure correctness with a 95% probability for only 27% (for DeepJIT) and 9% (for CC2Vec) of the JIT defect predictions. Additionally, our experiments indicate that CP might be a valuable technique for filtering out the false predictions of JIT defect predictors. CP can filter out up to 100% of false negative predictions and 90% of false positives generated by CC2Vec, and up to 86% of false negative predictions and 83% of false positives generated by DeepJIT.
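To make the idea of placing conformal prediction on top of a defect predictor more concrete, the following is a minimal sketch of generic split conformal prediction for a binary classifier. It is not the authors' experimental setup, which the abstract does not detail. The function names, the nonconformity score (one minus the predicted probability of the true label), and the toy calibration data are all assumptions made for illustration. Under the usual exchangeability assumption, the resulting prediction set contains the true label with probability at least 1 - alpha; a prediction is treated here as "guaranteed" only when the set holds exactly one label.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha=0.05):
    """Return the score threshold from a held-out calibration set.

    cal_probs:  predicted probability of 'defective' (label 1) per change.
    cal_labels: true labels (1 = defective, 0 = clean).
    All names here are hypothetical, chosen for this sketch only.
    """
    # Nonconformity score: 1 - probability assigned to the true label.
    scores = sorted(
        (1.0 - p) if y == 1 else p
        for p, y in zip(cal_probs, cal_labels)
    )
    n = len(scores)
    # Finite-sample corrected rank of the (1 - alpha) quantile.
    rank = math.ceil((n + 1) * (1.0 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: no label can be excluded.
        return math.inf
    return scores[rank - 1]


def prediction_set(prob_defective, threshold):
    """Return every label whose nonconformity score is within the threshold."""
    labels = set()
    if 1.0 - prob_defective <= threshold:  # score of label 1
        labels.add(1)
    if prob_defective <= threshold:        # score of label 0
        labels.add(0)
    return labels


def is_guaranteed(labels):
    """A singleton set is the only case treated as a confident prediction."""
    return len(labels) == 1


# Toy calibration data (invented for illustration, not from the paper).
cal_probs = [0.9, 0.8, 0.2, 0.1, 0.7, 0.3, 0.95, 0.05, 0.6, 0.4]
cal_labels = [1, 1, 0, 0, 1, 0, 1, 0, 1, 0]

threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.2)
for p in (0.97, 0.5, 0.03):
    labels = prediction_set(p, threshold)
    print(p, sorted(labels), "guaranteed" if is_guaranteed(labels) else "uncertain")
```

In a sketch like this, predictions whose set is empty or contains both labels would be the ones flagged as uncertain and filtered out before reaching a practitioner.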
Pages: 88-99
Number of pages: 12