Software Defect Prediction Using Ensemble Learning: A Systematic Literature Review

Cited by: 66
Authors
Matloob, Faseeha [1]
Ghazal, Taher M. [2, 3]
Taleb, Nasser [4]
Aftab, Shabib [1, 5]
Ahmad, Munir [5]
Khan, Muhammad Adnan [6]
Abbas, Sagheer [5]
Soomro, Tariq Rahim [7]
Affiliations
[1] Virtual Univ Pakistan, Dept Comp Sci, Lahore 44000, Pakistan
[2] Univ Kebangsaan Malaysia, Fac Informat Sci & Technol, Ctr Cyber Secur, Bangi 43600, Selangor, Malaysia
[3] Univ City Sharjah, Skyline Univ Coll, Sch Informat Technol, Sharjah, U Arab Emirates
[4] Canadian Univ Dubai, Fac Management, Dubai, U Arab Emirates
[5] Natl Coll Business Adm & Econ, Sch Comp Sci, Lahore 54660, Pakistan
[6] Gachon Univ, Dept Software, Pattern Recognit & Machine Learning Lab, Seongnam 13557, South Korea
[7] Inst Business Management, CCSIS, Karachi 75190, Sindh, Pakistan
Keywords
Software; Systematics; Data mining; Tools; Predictive models; Machine learning algorithms; Bibliographies; Systematic literature review (SLR); ensemble classifier; hybrid classifier; software defect prediction; FOREST
DOI
10.1109/ACCESS.2021.3095559
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Recent advances in software defect prediction (SDP) include the integration of multiple classification techniques to create an ensemble or hybrid approach. This technique was introduced to improve prediction performance by overcoming the limitations of any single classification technique. This research provides a systematic literature review on the use of ensemble learning for software defect prediction. The review critically analyzes research papers published since 2012 in four well-known online libraries: ACM, IEEE, Springer Link, and Science Direct. The study addresses five research questions covering different aspects of research progress on the use of ensemble learning for software defect prediction. To answer these questions, the 46 most relevant papers were shortlisted through a thorough systematic research process. The study provides compact information on the latest trends and advances in ensemble learning for software defect prediction, and a baseline for future innovations and further reviews. The ensemble methods most frequently employed by researchers are random forest, boosting, and bagging; less frequently employed methods include stacking, voting, and Extra Trees. Researchers have proposed many promising frameworks using ensemble learning methods, such as EMKCA, SMOTE-Ensemble, MKEL, SDAEsTSE, TLEL, and LRCR. AUC, accuracy, F-measure, recall, precision, and MCC were the measures most often used to evaluate prediction performance. WEKA was widely adopted as a machine learning platform. Many researchers showed through empirical analysis that feature selection and data sampling are necessary pre-processing steps that improve the performance of ensemble classifiers.
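As a rough illustration of two ideas named in the abstract, the sketch below combines three classifiers by majority voting and scores the result with accuracy, precision, recall, F-measure, and MCC. It is not taken from the reviewed paper: the function names (`majority_vote`, `confusion_counts`, `metrics`) and the toy label vectors are hypothetical, and the per-classifier predictions are hard-coded rather than produced by trained models. AUC is omitted because it needs ranked scores rather than hard labels.

```python
from collections import Counter
from math import sqrt


def majority_vote(predictions):
    """Combine per-classifier label lists into one list by majority vote.

    Assumes an odd number of voters and binary labels, so ties cannot occur.
    """
    # zip(*predictions) groups the votes cast for each sample
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, tn, fp, fn), treating `positive` as the defective class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return tp, tn, fp, fn


def metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, F-measure, and MCC."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "mcc": mcc,
    }


# Hypothetical toy data: 1 = defective module, 0 = clean module
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
clf_a = [1, 0, 0, 1, 0, 1, 1, 0]  # each classifier errs on two samples
clf_b = [1, 0, 1, 0, 0, 0, 1, 0]
clf_c = [0, 0, 1, 1, 1, 0, 1, 0]

ensemble = majority_vote([clf_a, clf_b, clf_c])

for name, pred in [("A", clf_a), ("B", clf_b), ("C", clf_c), ("vote", ensemble)]:
    print(name, metrics(y_true, pred))
```

In this toy data each classifier is wrong on a different pair of samples, so the vote recovers every label. That is the best case for an ensemble; real classifiers tend to make correlated errors, and the gain is usually smaller.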
Pages: 98754-98771
Page count: 18
Related papers
50 in total
  • [41] Data and Ensemble Machine Learning Fusion Based Intelligent Software Defect Prediction System
    Abbas, Sagheer
    Aftab, Shabib
    Khan, Muhammad Adnan
    Ghazal, Taher M.
    Al Hamadi, Hussam
    Yeun, Chan Yeob
    CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 75 (03): : 6083 - 6100
  • [42] Imbalanced Classification Methods for Student Grade Prediction: A Systematic Literature Review
    Bujang, Siti Dianah Abdul
    Selamat, Ali
    Krejcar, Ondrej
    Mohamed, Farhan
    Cheng, Lim Kok
    Chiu, Po Chan
    Fujita, Hamido
    IEEE ACCESS, 2023, 11 : 1970 - 1989
  • [43] Software Defect Prediction Using Heterogeneous Ensemble Classification Based on Segmented Patterns
    Alsawalqah, Hamad
    Hijazi, Neveen
    Eshtay, Mohammed
    Faris, Hossam
    Al Radaideh, Ahmed
    Aljarah, Ibrahim
    Alshamaileh, Yazan
    APPLIED SCIENCES-BASEL, 2020, 10 (05):
  • [44] Neighbor cleaning learning based cost-sensitive ensemble learning approach for software defect prediction
    Li, Li
    Su, Renjia
    Zhao, Xin
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2024, 36 (12)
  • [45] Software Defect Prediction Analysis Using Machine Learning Techniques
    Khalid, Aimen
    Badshah, Gran
    Ayub, Nasir
    Shiraz, Muhammad
    Ghouse, Mohamed
    SUSTAINABILITY, 2023, 15 (06)
  • [46] Comparing Methods for Large-Scale Agile Software Development: A Systematic Literature Review
    Edison, Henry
    Wang, Xiaofeng
    Conboy, Kieran
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2022, 48 (08) : 2709 - 2731
  • [47] On the Value of Oversampling for Deep Learning in Software Defect Prediction
    Yedida, Rahul
    Menzies, Tim
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2022, 48 (08) : 3103 - 3116
  • [48] Bootstrap aggregation ensemble learning-based reliable approach for software defect prediction by using characterized code feature
    Suresh Kumar, P.
    Behera, H. S.
    Nayak, Janmenjoy
    Naik, Bighnaraj
    INNOVATIONS IN SYSTEMS AND SOFTWARE ENGINEERING, 2021, 17 (04) : 355 - 379
  • [49] A Novel Imbalanced Ensemble Learning in Software Defect Predication
    Zheng, Jianming
    Wang, Xingqi
    Wei, Dan
    Chen, Bin
    Shao, Yanli
    IEEE ACCESS, 2021, 9 : 86855 - 86868
  • [50] Hellinger Net: A Hybrid Imbalance Learning Model to Improve Software Defect Prediction
    Chakraborty, Tanujit
    Chakraborty, Ashis Kumar
    IEEE TRANSACTIONS ON RELIABILITY, 2021, 70 (02) : 481 - 494