A Comparison of Semi-Supervised Classification Approaches for Software Defect Prediction

Cited by: 27
Author
Catal, Cagatay [1]
Affiliation
[1] Istanbul Kultur Univ, Dept Comp Engn, TR-34156 Istanbul, Turkey
Keywords
Defect prediction; expectation-maximization; low-density separation; quality estimation; semi-supervised classification; support vector machines
DOI
10.1515/jisys-2013-0030
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Predicting defect-prone modules when few defect labels are available from previous modules is a challenging problem in the software industry. Supervised classification approaches cannot build high-performance prediction models from such limited defect data, which creates a need for new methods, techniques, and tools. One solution is to combine labeled data points with unlabeled data points during the learning phase: semi-supervised classification methods use both to improve generalization. In this study, we evaluated four semi-supervised classification methods for defect prediction: low-density separation (LDS), support vector machine (SVM), expectation-maximization (EM-SEMI), and class mass normalization (CMN). The methods were investigated on four NASA data sets: CM1, KC1, KC2, and PC1. Experimental results showed that the SVM and LDS algorithms outperform the CMN and EM-SEMI algorithms. In addition, LDS performs much better than SVM when the data set is large. We therefore suggest the LDS-based prediction approach for software defect prediction when fault data are limited.
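To illustrate the general idea of combining labeled and unlabeled modules during learning, the following is a minimal sketch. It is not one of the four methods evaluated in the paper (LDS, SVM, EM-SEMI, CMN); it swaps in a much simpler technique, a nearest-centroid classifier refined by self-training. The function names, the two module metrics (lines of code, cyclomatic complexity), and all data values are hypothetical and chosen only for illustration.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of feature tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def fit_semi_supervised(labeled, unlabeled, iterations=10):
    """Nearest-centroid self-training (illustrative, not a method from the paper).

    labeled:   list of (features, label); label 0 = clean, 1 = defect-prone
    unlabeled: list of feature tuples without labels
    Returns a dict mapping each class label to its centroid.
    """
    # Start from centroids estimated on the labeled modules only.
    centroids = {c: centroid([x for x, y in labeled if y == c]) for c in (0, 1)}
    assignment = None
    for _ in range(iterations):
        # Pseudo-label every unlabeled module with its nearest centroid.
        new_assignment = [
            min(centroids, key=lambda c: dist(x, centroids[c])) for x in unlabeled
        ]
        if new_assignment == assignment:
            break  # pseudo-labels are stable, so the centroids are too
        assignment = new_assignment
        # Re-estimate centroids from labeled plus pseudo-labeled modules.
        centroids = {
            c: centroid(
                [x for x, y in labeled if y == c]
                + [x for x, a in zip(unlabeled, assignment) if a == c]
            )
            for c in (0, 1)
        }
    return centroids


def predict(centroids, x):
    """Return the label of the centroid closest to x."""
    return min(centroids, key=lambda c: dist(x, centroids[c]))


# Made-up toy data: (lines of code, cyclomatic complexity) per module.
labeled = [((20, 3), 0), ((60, 9), 1)]
unlabeled = [(10, 2), (15, 2), (25, 4), (30, 5), (70, 12), (80, 14), (90, 15)]

supervised_only = fit_semi_supervised(labeled, [])       # ignores unlabeled data
semi_supervised = fit_semi_supervised(labeled, unlabeled)

query = (45, 7)
print("supervised only:", predict(supervised_only, query))
print("semi-supervised:", predict(semi_supervised, query))
```

On this toy data the two models classify the query module differently, which shows how unlabeled points can shift the decision boundary. Whether that shift improves accuracy depends on the data, so the sketch makes no claim about performance.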
Pages: 75-82
Page count: 8