Hybrid SMOTE-Ensemble Approach for Software Defect Prediction

Cited by: 26
Authors
Alsawalqah, Hamad [1]
Faris, Hossam [1]
Aljarah, Ibrahim [1]
Alnemer, Loai [1]
Alhindawi, Nouh [2]
Affiliations
[1] Univ Jordan, King Abdullah II Sch Informat Technol, Amman, Jordan
[2] Jadara Univ, Fac Sci & Informat Technol, Dept Software Engn, Irbid, Jordan
Source
SOFTWARE ENGINEERING TRENDS AND TECHNIQUES IN INTELLIGENT SYSTEMS, CSOC2017, VOL 3 | 2017 / Vol. 575
Keywords
Software defect prediction; SMOTE; Ensemble approaches; Data mining; Software engineering; FAULT PREDICTION; QUALITY;
DOI
10.1007/978-3-319-57141-6_39
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Software defect prediction is the process of identifying new defects/bugs in software modules. A software defect is an error in a computer program caused by incorrect code or incorrect programming logic. Undiscovered defects therefore lead to poor-quality software products. In recent years, software defect prediction has received a considerable amount of attention from researchers. Most previous defect detection algorithms suffer from low defect detection ratios. Furthermore, software defect prediction is a very challenging problem because of the highly imbalanced class distribution, in which bug-free modules far outnumber defective ones. In this paper, the software defect prediction problem is formulated as a classification task, and the impact of several ensemble methods on classification effectiveness is examined. In addition, the best ensemble classifier is selected and retrained on datasets over-sampled with the Synthetic Minority Over-sampling Technique (SMOTE) algorithm to tackle the imbalanced distribution problem. The proposed hybrid method is evaluated on four software defect datasets. Experimental results demonstrate that the proposed method can effectively enhance defect prediction accuracy.
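The over-sampling step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the standard SMOTE idea of interpolating between a minority sample and one of its k nearest minority neighbours, and it omits the ensemble classifier entirely. The function name `smote`, its parameters, and the two-feature module metrics below are hypothetical and chosen only for illustration.

```python
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples.

    Hypothetical helper: a simplified SMOTE that picks a random minority
    sample, then a random one of its k nearest minority neighbours, and
    returns a point on the line segment between them.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        x = minority[i]
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


# Made-up module metrics (e.g. lines of code, cyclomatic complexity).
defective = [[120.0, 14.0], [95.0, 11.0], [150.0, 18.0]]
clean = [[20.0, 2.0], [35.0, 3.0], [18.0, 1.0], [40.0, 4.0],
         [25.0, 2.0], [30.0, 3.0], [22.0, 2.0], [28.0, 3.0]]

# Add enough synthetic defective samples to match the clean class size.
new_points = smote(defective, n_synthetic=len(clean) - len(defective), k=2)
balanced_defective = defective + new_points
```

After this step both classes have the same number of samples, and a classifier (an ensemble, in the paper's setting) would be trained on the balanced data. Synthetic points should be generated from the training split only, so that the test data stays untouched.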
Pages: 355-366
Number of pages: 12