Cost-Sensitive Learner on Hybrid SMOTE-Ensemble Approach to Predict Software Defects

Cited by: 5
Authors
Abuqaddom, Inas [1]
Hudaib, Amjad [2]
Affiliations
[1] Jordan Univ, King Abdullah II Sch Informat Technol, Dept Comp Sci, Amman, Jordan
[2] Jordan Univ, King Abdullah II Sch Informat Technol, Dept Comp Informat Syst, Amman, Jordan
Source
COMPUTATIONAL AND STATISTICAL METHODS IN INTELLIGENT SYSTEMS | 2019 / Vol. 859
Keywords
Cost matrix; Cost-sensitive learner; Data mining; Ensemble approaches; Imbalanced dataset; SMOTE; Software defect prediction; Software engineering; QUALITY
DOI
10.1007/978-3-030-00211-4_2
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A software defect is a mistake in a computer program or system that causes it to produce incorrect or unexpected results, or to behave in unintended ways. Machine learning methods are helpful in software defect prediction, despite the challenge of an imbalanced defect distribution, in which non-defective modules far outnumber defective ones. In this paper we introduce an enhancement to the most recent hybrid SMOTE-Ensemble approach to the software defect problem, using a Cost-Sensitive Learner (CSL) to improve the handling of the imbalanced distribution. The paper uses four publicly available software defect datasets with different imbalance ratios, and provides a comparative performance analysis against the most recent hybrid SMOTE-Ensemble approach to software defect prediction. Experimental results show that combining multiple machine learning techniques to cope with imbalanced datasets improves the prediction of software defects. The results also reveal that the cost-sensitive learner performs better on highly imbalanced datasets than on mildly imbalanced ones.
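The abstract names two ingredients, SMOTE oversampling and a cost matrix, without giving their settings. The following is a minimal sketch of both ideas in plain Python, not the authors' implementation: the function names `smote_sample` and `min_cost_label`, the neighbour count, and the cost values (a missed defect costing 5, a false alarm costing 1) are all assumptions chosen for illustration, and the ensemble classifier that would supply the defect probability is left out.

```python
import random


def smote_sample(minority, k=2, n_new=4, seed=0):
    """Create n_new synthetic minority points (SMOTE-style interpolation).

    Each new point lies on the segment between a randomly chosen minority
    point and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base by squared Euclidean distance
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


def min_cost_label(p_defect, cost_fn=5.0, cost_fp=1.0):
    """Pick the label with the lower expected cost under a 2x2 cost matrix.

    cost_fn: assumed cost of labelling a defective module as clean.
    cost_fp: assumed cost of labelling a clean module as defective.
    Correct predictions are assumed to cost nothing.
    """
    cost_if_predict_clean = p_defect * cost_fn
    cost_if_predict_defect = (1.0 - p_defect) * cost_fp
    return 1 if cost_if_predict_defect < cost_if_predict_clean else 0


# Hypothetical feature vectors for three defective modules.
defective = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0)]
extra = smote_sample(defective)
labels = [min_cost_label(p) for p in (0.1, 0.3, 0.6)]
```

With these assumed costs the decision threshold on the defect probability drops from 0.5 to 1/6, so a module with a 30% estimated chance of being defective is already flagged. Making missed defects more expensive than false alarms is the mechanism by which a cost-sensitive learner compensates for the scarcity of defective modules.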
Pages: 12-21
Page count: 10