Software defects prediction by metaheuristics tuned extreme gradient boosting and analysis based on Shapley Additive Explanations

Cited by: 27
Authors
Zivkovic, Tamara [1]
Nikolic, Bosko [1]
Simic, Vladimir [2, 3]
Pamucar, Dragan [4]
Bacanin, Nebojsa [5]
Affiliations
[1] Univ Belgrade, Sch Elect Engn, Bulevar Kralja Aleksandra 73, Belgrade 11000, Serbia
[2] Univ Belgrade, Fac Transport & Traff Engn, Vojvode Stepe 305, Belgrade 11000, Serbia
[3] Yuan Ze Univ, Coll Engn, Dept Ind Engn & Management, Yuandong Rd, Taoyuan City 320315, Taiwan
[4] Univ Belgrade, Fac Org Sci, Dept Operat Res & Stat, Jove Ilica 154, Belgrade 11000, Serbia
[5] Singidunum Univ, Fac Informat & Comp, Danijelova 32, Belgrade 11000, Serbia
Keywords
Software testing; Software defect prediction; XGBoost; Reptile search algorithm; Metaheuristics optimization
DOI
10.1016/j.asoc.2023.110659
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Software testing is a crucial component of software development, and it often makes the difference between successful and failed projects. Despite its importance, the fast pace and short deadlines of contemporary projects mean that testing is often neglected or insufficiently detailed, which can lead to loss of reputation, private user data, money, and in some circumstances even lives. In such situations, it would be valuable to predict which modules are error-prone from a collection of software metrics and to focus testing on them; this is a typical classification task. Machine learning models have been applied to a wide range of classification problems with significant success, and this paper proposes an eXtreme gradient boosting (XGBoost) model for the defect prediction task. A modified variant of the well-known reptile search optimization algorithm is proposed to tune the XGBoost hyperparameters. The enhanced algorithm, named HARSA, was evaluated on the challenging CEC2019 benchmark functions, where it exhibited excellent performance. The XGBoost model tuned by the proposed algorithm was then evaluated on two benchmark software testing datasets, and the simulation outcomes were compared with those of other powerful swarm intelligence metaheuristics run in an identical experimental environment; the proposed approach attained superior classification accuracy on both datasets. Finally, Shapley Additive Explanations analysis was conducted to determine the impact of various software metrics on the classification results.
© 2023 Elsevier B.V. All rights reserved.
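The abstract's last step attributes a classifier's output to individual software metrics using Shapley values. As a minimal sketch of that idea, and not the paper's pipeline, the Python below computes exact Shapley values from the definition for a single instance. The scoring function `toy_risk`, the three metric names, and all numeric values are hypothetical stand-ins for a trained XGBoost model and real software metrics; the SHAP library itself uses faster tree-specific algorithms rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for instance x relative to a baseline instance.

    Brute-force enumeration over all feature subsets, so only practical
    for a handful of features.
    """
    n = len(x)
    features = range(n)

    def value(subset):
        # Features in `subset` take the instance's values; the rest
        # stay at the baseline values.
        z = [x[i] if i in subset else baseline[i] for i in features]
        return predict(z)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(metrics):
    """Hypothetical defect-risk score over three made-up software metrics."""
    loc, complexity, churn = metrics
    return 0.01 * loc + 0.05 * complexity + 0.002 * loc * churn


x = [200.0, 12.0, 3.0]          # hypothetical module being explained
baseline = [100.0, 5.0, 1.0]    # hypothetical reference module
phi = shapley_values(toy_risk, x, baseline)

for name, contribution in zip(["loc", "complexity", "churn"], phi):
    print(f"{name}: {contribution:+.4f}")
```

The contributions sum to the difference between the score for `x` and the score for `baseline`, which is the additivity property that gives Shapley Additive Explanations its name.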
Pages: 32
Related papers
50 results in total
  • [21] Predicting and analyzing flood susceptibility using boosting-based ensemble machine learning algorithms with SHapley Additive exPlanations
    Aydin, Halit Enes
    Iban, Muzaffer Can
    NATURAL HAZARDS, 2023, 116 (03) : 2957 - 2991
  • [22] Predicting and analyzing flood susceptibility using boosting-based ensemble machine learning algorithms with SHapley Additive exPlanations
    Halit Enes Aydin
    Muzaffer Can Iban
    Natural Hazards, 2023, 116 : 2957 - 2991
  • [23] Prediction of HHV of fuel by Machine learning Algorithm: Interpretability analysis using Shapley Additive Explanations (SHAP)
    Timilsina, Manish Sharma
    Sen, Subhadip
    Uprety, Bibek
    Patel, Vashishtha B.
    Sharma, Prateek
    Sheth, Pratik N.
    FUEL, 2024, 357
  • [25] Parametric Analysis for Torque Prediction in Friction Stir Welding Using Machine Learning and Shapley Additive Explanations
    Belalia, Sif Eddine
    Serier, Mohamed
    Al-Sabur, Raheem
    JOURNAL OF COMPUTATIONAL APPLIED MECHANICS, 2024, 55 (01) : 113 - 124
  • [26] Viscosity and melting temperature prediction of mold fluxes based on explainable machine learning and SHapley additive exPlanations
    Yan, Wei
    Shen, Yangyang
    Chen, Shoujie
    Wang, Yongyuan
    JOURNAL OF NON-CRYSTALLINE SOLIDS, 2024, 636
  • [27] Interpretable general thermal comfort model based on physiological data from wearable bio sensors: Light Gradient Boosting Machine (LightGBM) and SHapley Additive exPlanations (SHAP)
    Kim, Hyunsoo
    Lee, Gaang
    Ahn, Hyeunguk
    Choi, Byungjoo
    BUILDING AND ENVIRONMENT, 2024, 266
  • [28] Road surface temperature prediction based on gradient extreme learning machine boosting
    Liu, Bo
    Yan, Shuo
    You, Huanling
    Dong, Yan
    Li, Yong
    Lang, Jianlei
    Gu, Rentao
    COMPUTERS IN INDUSTRY, 2018, 99 : 294 - 302
  • [29] Predicting and Analyzing Road Traffic Injury Severity Using Boosting-Based Ensemble Learning Models with SHAPley Additive exPlanations
    Dong, Sheng
    Khattak, Afaq
    Ullah, Irfan
    Zhou, Jibiao
    Hussain, Arshad
    INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH, 2022, 19 (05)
  • [30] Hybrid machine learning approach to prediction of the compressive and flexural strengths of UHPC and parametric analysis with shapley additive explanations
    Das, Pobithra
    Kashem, Abul
    CASE STUDIES IN CONSTRUCTION MATERIALS, 2024, 20