A code change-oriented approach to just-in-time defect prediction with multiple input semantic fusion

Cited by: 0
Authors
Huang, Teng [1]
Yu, Hui-Qun [1]
Fan, Gui-Sheng [1]
Huang, Zi-Jie [1]
Wu, Chen-Yu [1]
Affiliations
[1] East China Univ Sci & Technol, Sch Informat Sci & Engn, Shanghai, Peoples R China
Funding
Natural Science Foundation of Shanghai
Keywords
deep learning; defect prediction; just-in-time; software defect
DOI
10.1111/exsy.13702
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent research has found that fine-tuning pre-trained models is superior to training models from scratch for just-in-time (JIT) defect prediction. However, existing approaches that use pre-trained models have two limitations. First, the input length is constrained by the pre-trained model. Second, the inputs are change-agnostic. To address these limitations, we propose JIT-Block, a JIT defect prediction method that combines multiple input semantics and uses the changed block as the fundamental unit. We restructure the JIT-Defects4J dataset used in previous research and conduct a comprehensive comparison against six state-of-the-art baseline models using eleven performance metrics, including both effort-aware and effort-agnostic measures. The results demonstrate that on the JIT defect prediction task, our approach outperforms the baseline models on all six metrics, with improvements ranging from 1.5% to 800% in effort-agnostic metrics and 0.3% to 57% in effort-aware metrics. On the JIT defect code line localization task, our approach outperforms the baseline models on three out of five metrics, with improvements of 11% to 140%.
Pages: 16