Two-tier deep and machine learning approach optimized by adaptive multi-population firefly algorithm for software defects prediction

Cited by: 0
Authors
Villoth, John Philipose [1]
Zivkovic, Miodrag [1]
Zivkovic, Tamara [1]
Abdel-salam, Mahmoud [2]
Hammad, Mohamed [3, 4]
Jovanovic, Luka [5]
Simic, Vladimir [6, 7, 8]
Bacanin, Nebojsa [1, 9, 10]
Affiliations
[1] Singidunum Univ, Fac Informat & Comp, Danijelova 32, Belgrade 11000, Serbia
[2] Mansoura Univ, Fac Comp & Informat Sci, Mansoura 35516, Egypt
[3] Prince Sultan Univ, Coll Comp & Informat Sci, EIAS Data Sci Lab, Riyadh 11586, Saudi Arabia
[4] Menoufia Univ, Fac Comp & Informat, Dept Informat Technol, Shibin Al Kawm 32511, Egypt
[5] Singidunum Univ, Tech Fac, Danijelova 32, Belgrade 11000, Serbia
[6] Univ Belgrade, Fac Transport & Traff Engn, Vojvode Stepe 305, Belgrade 11010, Serbia
[7] Yuan Ze Univ, Coll Engn, Dept Ind Engn & Management, Yuandong Rd,Zhongli Dis, Taoyuan City 320315, Taiwan
[8] Korea Univ, Coll Informat, Dept Comp Sci & Engn, 145 Anam-ro, Seoul 02841, South Korea
[9] SIMATS, Saveetha Sch Engn, Dept Math, Chennai 602105, Tamilnadu, India
[10] Sinergija Univ, Bijeljina 76300, Bosnia & Herceg
Keywords
Natural language processing; Convolutional neural network; Software defect prediction; Metaheuristics optimization; Explainable artificial intelligence;
DOI
10.1016/j.neucom.2025.129695
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Software plays an increasingly crucial role as automated software systems control essential operations. As development demands grow, manual code reviews become increasingly difficult, and testing frequently lasts longer than development itself. A promising option for improving defect identification in source code is to combine artificial intelligence with natural language processing (NLP). Analyzing source code offers an efficient way to improve defect detection and prevent errors. This study investigates source code analysis using NLP and machine learning, evaluating both traditional and contemporary error detection techniques. Metaheuristic algorithms are used to tune the machine learning classifiers, and a modified variant of the well-known firefly algorithm is proposed as part of this research. A two-tier framework is proposed: in the first tier, a convolutional neural network (CNN) handles complex feature spaces; in the second tier, eXtreme gradient boosting (XGBoost), adaptive boosting (AdaBoost), and categorical boosting (CatBoost) classifiers are employed to improve defect detection. Supplementary simulations using a custom term frequency-inverse document frequency (TF-IDF) encoding are also carried out to showcase the capabilities of the proposed framework. In total, seven experiments are carried out on publicly accessible datasets. The CNN achieves an accuracy of 80.6% on the defect prediction task, which the second tier with XGBoost, AdaBoost, and CatBoost raises to nearly 81.5%. The experiments with the NLP approach yield superior outcomes, with XGBoost, AdaBoost, and CatBoost achieving accuracies of 99.6%, 99.7%, and 99.8%, respectively, indicating the large potential of the proposed approach in the software testing domain.
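The abstract mentions tuning classifiers with a firefly algorithm. As a rough illustration of the underlying idea only, the sketch below implements the standard firefly update (dimmer fireflies move toward brighter ones, with attractiveness decaying with distance), not the adaptive multi-population variant proposed in the paper, whose details are not given in this record. The `sphere` objective is a toy stand-in for a classifier's validation error, and all function names, parameter names, and default values are assumptions made for this example.

```python
import math
import random


def sphere(x):
    """Toy objective (assumption): stands in for a validation error to minimize."""
    return sum(v * v for v in x)


def firefly_minimize(objective, dim, bounds, n_fireflies=15, n_iter=50,
                     alpha=0.2, beta0=1.0, gamma=1.0, seed=0):
    """Standard firefly search; returns (best_x, best_cost, initial_best_cost)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial population inside the search bounds.
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_fireflies)]
    cost = [objective(x) for x in pop]
    best_i = min(range(n_fireflies), key=cost.__getitem__)
    best_x, best_cost = list(pop[best_i]), cost[best_i]
    initial_best = best_cost

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                # Lower cost means a brighter firefly; i moves toward brighter j.
                if cost[j] < cost[i]:
                    r2 = sum((a - b) ** 2 for a, b in zip(pop[i], pop[j]))
                    beta = beta0 * math.exp(-gamma * r2)  # attractiveness
                    pop[i] = [
                        min(hi, max(lo, a + beta * (b - a)
                                    + alpha * (rng.random() - 0.5)))
                        for a, b in zip(pop[i], pop[j])
                    ]
                    cost[i] = objective(pop[i])
                    # Keep the best solution seen so far.
                    if cost[i] < best_cost:
                        best_x, best_cost = list(pop[i]), cost[i]
        alpha *= 0.97  # gradually reduce the random step size

    return best_x, best_cost, initial_best


if __name__ == "__main__":
    x, c, c0 = firefly_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
    print("best position:", x)
    print("best cost:", c, "(initial best:", c0, ")")
```

In a hyperparameter-tuning setting, each position vector would encode classifier settings (for example a learning rate and a tree depth) and the objective would train and score a model, which is far more expensive than the toy function used here.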
Pages: 28