Local modeling approach for cross-project defect prediction

Cited by: 0
Authors
Bhat, Nayeem Ahmad [1]
Farooq, Sheikh Umar [1]
Affiliations
[1] Univ Kashmir, Dept Comp Sci, North Campus, Srinagar, J&K, India
Source
INTELLIGENT DECISION TECHNOLOGIES-NETHERLANDS | 2021 / Vol. 15 / Issue 04
Keywords
Cross-project defect prediction; local modelling; software quality assurance; training data selection; STATIC CODE ATTRIBUTES; QUALITY;
DOI
10.3233/IDT-210130
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Prediction approaches used for cross-project defect prediction (CPDP) are usually impractical because of high false-alarm rates or low detection rates. Instance-based data filter techniques that improve CPDP performance are time-consuming, and the entire filter procedure must be repeated each time a new test set arrives for prediction. We propose a local modeling approach for utilizing ever-increasing cross-project data in CPDP. We cluster the cross-project data, train one prediction model per cluster, and predict each target test instance using the model of its corresponding cluster. Using statistical methods, we compared the performance of within-project prediction, cross-project prediction, and our local modeling approach on 7 NASA datasets. Compared with within-project prediction, cross-project prediction increased the probability of detection (PD), but this came with an increase in the probability of false alarm (PF) and a decrease in overall performance as measured by Balance. Applying local modeling decreased PF, accompanied by a decrease in PD, and improved overall performance in terms of Balance. Moreover, compared with one state-of-the-art filter technique, the Burak filter, our approach is simple, fast, and comparable in performance, and it opens a new perspective on the utilization of ever-increasing cross-project data for defect prediction. Therefore, when insufficient within-project data is available, we recommend training local cluster models rather than a single global model on cross-project datasets.
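The pipeline described in the abstract (cluster the cross-project data, train one model per cluster, route each target instance to the model of its cluster) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: plain k-means as the clustering step, a nearest-class-mean classifier standing in for the per-cluster learner, labels encoded as 1 (defective) and 0 (clean), and the function names. Balance is computed with the definition commonly used in the defect prediction literature, the normalized distance from the ideal point PF = 0, PD = 1, which the abstract itself does not spell out.

```python
import math
import random

def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(rows):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def nearest(centers, x):
    """Index of the center closest to x."""
    return min(range(len(centers)), key=lambda i: dist2(centers[i], x))

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means (an assumed clustering step); returns the k centers."""
    centers = random.Random(seed).sample(X, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in X:
            groups[nearest(centers, x)].append(x)
        # Keep the previous center if a cluster ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers

def fit_local_models(X, y, k):
    """Cluster the cross-project data and fit one model per cluster.

    Each per-cluster model is a dict mapping a label to the mean vector of
    that label inside the cluster (a stand-in for a real learner).
    """
    centers = kmeans(X, k)
    models = []
    for c in range(k):
        members = [(x, lab) for x, lab in zip(X, y) if nearest(centers, x) == c]
        members = members or list(zip(X, y))  # empty cluster: use all data
        labels = {lab for _, lab in members}
        models.append({lab: mean([x for x, l in members if l == lab])
                       for lab in labels})
    return centers, models

def predict(centers, models, x):
    """Route x to its nearest cluster and apply that cluster's model."""
    class_means = models[nearest(centers, x)]
    return min(class_means, key=lambda lab: dist2(class_means[lab], x))

def pd_pf_balance(y_true, y_pred):
    """PD, PF and Balance for binary labels (1 = defective, 0 = clean)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    pd = tp / (tp + fn) if tp + fn else 0.0
    pf = fp / (fp + tn) if fp + tn else 0.0
    balance = 1 - math.sqrt((0 - pf) ** 2 + (1 - pd) ** 2) / math.sqrt(2)
    return pd, pf, balance
```

Because the cluster models are fitted once, a newly arriving test set only needs the routing step in `predict`, whereas an instance-based filter would rerun its selection over the training data. This matches the speed argument made in the abstract, although the sketch makes no claim about the paper's actual learners or settings.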
Pages: 623-637
Page count: 15
Related papers
45 entries in total
[1]   A systematic and comprehensive investigation of methods to build and evaluate fault prediction models [J].
Arisholm, Erik ;
Briand, Lionel C. ;
Johannessen, Eivind B. .
JOURNAL OF SYSTEMS AND SOFTWARE, 2010, 83 (01) :2-17
[2]   Arthur D., 2006, Proceedings of the Twenty-Second Annual Symposium on Computational Geometry (SCG'06), P144, DOI 10.1145/1137856.1137880
[3]   Bettenburg N., 2012, 2012 9th IEEE Working Conference on Mining Software Repositories (MSR 2012), P60, DOI 10.1109/MSR.2012.6224300
[4]   Bhat Nayeem Ahmad, 2020, International Journal of Open Source Software and Processes, V11, P20, DOI 10.4018/IJOSSP.2020070102
[5]   Training data selection for cross-project defection prediction: which approach is better? [J].
Bin, Yi ;
Zhou, Kai ;
Lu, Hongmin ;
Zhou, Yuming ;
Xu, Baowen .
11TH ACM/IEEE INTERNATIONAL SYMPOSIUM ON EMPIRICAL SOFTWARE ENGINEERING AND MEASUREMENT (ESEM 2017), 2017, :354-363
[6]   Software defect prediction: do different classifiers find the same defects? [J].
Bowes, David ;
Hall, Tracy ;
Petric, Jean .
SOFTWARE QUALITY JOURNAL, 2018, 26 (02) :525-552
[7]   The use of cross-company fault data for the software fault prediction problem [J].
Catal, Cagatay .
TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2016, 24 (05) :3714-3723
[8]   Charrad M, 2012, J STAT SOFTW
[9]   Negative samples reduction in cross-company software defects prediction [J].
Chen, Lin ;
Fang, Bin ;
Shang, Zhaowei ;
Tang, Yuanyan .
INFORMATION AND SOFTWARE TECHNOLOGY, 2015, 62 :67-77
[10]   Gray David, 2011, 15th Annual Conference on Evaluation & Assessment in Software Engineering (EASE 2011), P96, DOI 10.1049/ic.2011.0012