Machine learning based methods for software fault prediction: A survey

Cited by: 68

Authors:

Pandey, Sushant Kumar [1]

Mishra, Ravi Bhushan [1]

Tripathi, Anil Kumar [1]

Affiliations:

[1] Indian Inst Technol BHU, Dept Comp Sci & Engn, Varanasi, Uttar Pradesh, India

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2021 / Volume 172

Keywords:

Machine learning; Fault proneness; Statistical techniques; Fault prediction; Systematic literature review; DEFECT PREDICTION; EMPIRICAL-ANALYSIS; FEATURE-SELECTION; MODEL; QUALITY; METRICS; CLASSIFICATION; PRONENESS; FRAMEWORK; REGRESSION;

DOI:

10.1016/j.eswa.2021.114595

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Software engineering encompasses several prediction tasks, such as the prediction of effort, security, quality, faults, cost, and reusability. All of these prediction approaches are still at a rudimentary stage, and experiments and research are being conducted to build robust models. Software Fault Prediction (SFP) is the process of developing models that software practitioners can use to detect faulty classes/modules before the testing phase. Predicting defective modules before the testing phase helps the software development team leader allocate resources more effectively and reduces the testing effort. In this article, we present a Systematic Literature Review (SLR) of studies from 1990 to June 2019 that apply machine learning and statistical methods to software fault prediction. We cite 208 research articles, of which we studied 154 relevant ones. We investigated the competence of machine learning on existing datasets and research projects. To the best of our knowledge, existing SLRs considered only a few parameters of SFP's performance and only partially examined the various threats to and challenges of SFP techniques. In this article, we aggregate those parameters and analyze them accordingly, and we also illustrate the different challenges in the SFP domain. We also compare the performance of SFP models based on machine learning with that of models based on statistical techniques. Our empirical study and analysis demonstrate that machine learning techniques are better than classical statistical models at classifying a class/module as fault-prone or non-fault-prone. Likewise, the performance of machine learning-based SFP methods in predicting fault susceptibility is better than that of conventional statistical techniques. The empirical evidence of our survey indicates that machine learning techniques are capable of identifying fault proneness and of producing well-generalized results.
We have also investigated a few challenges in the fault prediction discipline, i.e., quality of data, over-fitting of models, and the class imbalance problem. We have also summarized the 154 articles in tabular form for quick identification.

Pages: 35

