Software fault prediction using evolving populations with mathematical diversification
Cited by: 2
Authors:
Goyal, Somya [1]
Affiliation:
[1] Manipal Univ Jaipur, Jaipur 303007, Rajasthan, India
Keywords:
Software fault prediction (SFP);
Feature selection (FS);
Search-based software engineering (SBSE);
Genetic evolution (GE);
Mathematical operator algorithm;
Artificial neural network (ANN);
DEFECT PREDICTION;
FEATURE-SELECTION;
OPTIMIZATION;
METRICS;
ALGORITHM;
QUALITY;
DOI:
10.1007/s00500-022-07445-6
Chinese Library Classification:
TP18 [Artificial intelligence theory];
Subject classification codes:
081104 ;
0812 ;
0835 ;
1405 ;
Abstract:
Software fault prediction (SFP) plays a vital role in fostering high quality throughout the software development process. It allows fault-prone modules to be identified in early development phases and facilitates focused, effective testing of those modules. Machine learning (ML)-based classifiers are prominently used for fault prediction in the software industry. The accuracy of ML models depends on the training data and its quality. The curse of high dimensionality adversely impacts the classification power of an ML model. The presence of inter-correlated, insignificant and/or redundant features (or attributes) in the training data hinders the performance of ML classifiers. Feature preprocessing (or feature selection (FS)) is the solution to this issue. Meta-heuristics are the key method for finding the most significant feature subset. In this paper, a novel feature selection method is devised using mathematical diversification for genetic evolution. It avoids local optima by utilizing arithmetic diversification among the candidate solutions (or populations). Survival of the fittest is the working principle of evolving populations with crossover and mutation operations. The selected feature subset is fed to five classification algorithms, namely artificial neural network, support vector machine, decision tree, k-nearest neighbor and naive Bayes. The proposed model is trained and tested over five datasets from the NASA corpus, namely CM1, JM1, KC1, KC2 and PC1. In total, 100 SFP models are implemented (4 feature selection methods × 5 datasets × 5 classification algorithms). From the experiments, it is observed that the SFP models with the proposed feature selection technique of evolving populations with mathematical diversification (FS-EPwMD) are better than other models. It can be concluded that the proposed SFP model built using the proposed FS-EPwMD with artificial neural networks performs statistically best among all 100 competing SFP models, irrespective of the datasets used.
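The abstract describes feature selection as a genetic search over candidate feature subsets, with crossover, mutation and an extra diversification step. The following is only a minimal sketch of that general idea, not the paper's FS-EPwMD algorithm: the abstract does not specify the arithmetic diversification operator, so `diversify` is a hypothetical placeholder, and `fitness` is a toy stand-in for the accuracy of a trained classifier. All function names, parameters and default values here are assumptions made for illustration.

```python
import random

def fitness(mask, relevant):
    # Toy stand-in for classifier accuracy (assumption): +1 for each relevant
    # feature kept, minus a small penalty per selected feature.
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in relevant)
    return hits - 0.1 * sum(mask)

def crossover(a, b, rng):
    # Single-point crossover of two binary feature masks.
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(mask, rate, rng):
    # Flip each bit independently with probability `rate`.
    return [1 - bit if rng.random() < rate else bit for bit in mask]

def diversify(mask, rng):
    # Hypothetical placeholder for the arithmetic diversification step:
    # add a random offset to each bit and threshold the result at 0.5.
    return [1 if bit + rng.uniform(-1.0, 1.0) > 0.5 else 0 for bit in mask]

def select_features(n_features, relevant, pop_size=20, generations=30,
                    mutation_rate=0.05, n_diversified=4, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda m: fitness(m, relevant), reverse=True)
        elite = population[0]  # elitism: the best mask survives unchanged
        history.append(fitness(elite, relevant))
        parents = population[:pop_size // 2]  # fitter half breeds
        children = []
        while len(children) < pop_size - 1:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), mutation_rate, rng))
        # Perturb the last few children more strongly to keep variety.
        for i in range(len(children) - n_diversified, len(children)):
            children[i] = diversify(children[i], rng)
        population = [elite] + children
    best = max(population, key=lambda m: fitness(m, relevant))
    history.append(fitness(best, relevant))
    return best, history

if __name__ == "__main__":
    best_mask, best_history = select_features(10, {0, 3, 7})
    print(best_mask, best_history[-1])
```

In a real SFP pipeline the toy `fitness` would presumably be replaced by a cross-validated score of one of the five classifiers on the selected columns of a dataset such as CM1; that wiring is left out here to keep the sketch self-contained.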
Pages: 13999-14020
Page count: 22
Related papers