An empirical analysis of the effectiveness of software metrics and fault prediction model for identifying faulty classes

Cited by: 42

Authors:

Kumar, Lov [1]

Misra, Sanjay [2]

Rath, Santanu Ku. [1]

Affiliations:

[1] Natl Inst Technol, Dept CSE, Rourkela, India

[2] Atilim Univ, Dept Comp Engn, Ankara, Turkey

Source:

COMPUTER STANDARDS & INTERFACES | 2017 / Vol. 53

Keywords:

Feature selection techniques; Artificial neural network; Ensemble method; Source code metrics; Cost analysis framework; NEURAL-NETWORK; VALIDATION; QUALITY;

DOI:

10.1016/j.csi.2017.02.003

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Software fault prediction models are used to predict faulty modules at a very early stage of the software development life cycle. Predicting fault proneness using source code metrics is an area that has attracted the attention of several researchers. The performance of a model that assesses fault proneness depends on the source code metrics used as its input. In this work, we propose a framework to validate source code metrics and identify a suitable set of them, with the aim of reducing irrelevant features and improving the performance of the fault prediction model. Initially, we applied a t-test analysis and a univariate logistic regression analysis to each source code metric to evaluate its potential for predicting fault proneness. Next, we performed a correlation analysis and multivariate linear regression stepwise forward selection to find the right set of source code metrics for fault prediction. The resulting set of source code metrics is used as the input to develop fault prediction models using a neural network with five different training algorithms and three different ensemble methods. The effectiveness of the developed fault prediction models is evaluated using a proposed cost evaluation framework. We performed experiments on fifty-six open-source Java projects. The experimental results reveal that the model developed with the set of source code metrics selected by the suggested validation framework as its input achieves better results compared with all other metrics. The experimental results also demonstrate that the fault prediction model is best suited to projects in which the percentage of faulty classes is below a threshold value that depends on fault identification efficiency (low: 48.89%, median: 39.26%, and high: 27.86%).
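The first two screening steps named in the abstract (a per-metric t-test, then a correlation analysis) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the cut-off values `t_cut` and `r_cut`, the use of Welch's variant of the t-test, the metric names `wmc`, `cbo`, `noc`, and the toy data are all assumptions made for illustration. The logistic regression and stepwise forward selection steps are not shown.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (unequal variances)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return cov / sqrt(sum((xi - mx) ** 2 for xi in x) * sum((yi - my) ** 2 for yi in y))


def select_metrics(metrics, faulty, t_cut=2.0, r_cut=0.8):
    """Screen metrics by t-statistic, then drop highly correlated ones.

    metrics: dict mapping metric name -> list of values, one per class.
    faulty:  list of 0/1 labels, one per class.
    t_cut, r_cut: illustrative thresholds (assumptions, not from the paper).
    """
    # Step 1: keep metrics whose means differ between faulty and clean classes.
    scored = []
    for name, values in metrics.items():
        pos = [v for v, f in zip(values, faulty) if f]
        neg = [v for v, f in zip(values, faulty) if not f]
        t = abs(welch_t(pos, neg))
        if t >= t_cut:
            scored.append((t, name))
    # Step 2: from strongest to weakest, skip any metric that is highly
    # correlated with a metric that has already been kept.
    kept = []
    for _, name in sorted(scored, reverse=True):
        if all(abs(pearson(metrics[name], metrics[k])) < r_cut for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: eight classes, the first four labelled faulty.
faulty = [1, 1, 1, 1, 0, 0, 0, 0]
toy_metrics = {
    "wmc": [20, 25, 22, 30, 5, 7, 6, 9],
    "cbo": [10, 13, 11, 15, 3, 4, 3, 5],
    "noc": [1, 0, 2, 1, 1, 2, 0, 1],
}
selected = select_metrics(toy_metrics, faulty)
print(selected)
```

On this toy data, `noc` fails the t-test screen because its group means are equal, and `cbo` is dropped because it is almost perfectly correlated with the higher-scoring `wmc`.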

Pages: 1-32

Page count: 32
