Software defects prediction by metaheuristics tuned extreme gradient boosting and analysis based on Shapley Additive Explanations

Cited by: 27
Authors
Zivkovic, Tamara [1]
Nikolic, Bosko [1]
Simic, Vladimir [2, 3]
Pamucar, Dragan [4]
Bacanin, Nebojsa [5]
Affiliations
[1] Univ Belgrade, Sch Elect Engn, Bulevar Kralja Aleksandra 73, Belgrade 11000, Serbia
[2] Univ Belgrade, Fac Transport & Traff Engn, Vojvode Stepe 305, Belgrade 11000, Serbia
[3] Yuan Ze Univ, Coll Engn, Dept Ind Engn & Management, Yuandong Rd, Taoyuan City 320315, Taiwan
[4] Univ Belgrade, Fac Org Sci, Dept Operat Res & Stat, Jove Ilica 154, Belgrade 11000, Serbia
[5] Singidunum Univ, Fac Informat & Comp, Danijelova 32, Belgrade 11000, Serbia
Keywords
Software testing; Software defect prediction; XGBoost; Reptile search algorithm; Metaheuristics optimization
DOI
10.1016/j.asoc.2023.110659
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Software testing is a crucial component of software development, and it often makes the difference between successful and failed projects. Despite its importance, the fast pace and short deadlines of contemporary projects mean that testing is often neglected or insufficiently detailed, which can lead to loss of reputation, private user data, money, and, in some circumstances, even lives. In such situations, it is vital to be able to predict which modules are error-prone from a collection of software metrics and to focus testing on them; this is a typical classification task. Machine learning models have been applied to a wide range of classification problems with significant success, and this paper proposes an eXtreme gradient boosting (XGBoost) model for the defect prediction task. A modified variant of the well-known reptile search optimization algorithm is proposed to tune the XGBoost hyperparameters. The enhanced algorithm, named HARSA, was evaluated on the challenging CEC2019 benchmark functions, where it exhibited excellent performance. The XGBoost model tuned by the proposed algorithm was then evaluated on two benchmark software testing datasets, and the simulation outcomes were compared with those of other powerful swarm intelligence metaheuristics run in an identical experimental environment; the proposed approach attained superior classification accuracy on both datasets. Finally, Shapley Additive Explanations analysis was conducted to determine the impact of various software metrics on the classification results.
© 2023 Elsevier B.V. All rights reserved.
Pages: 32