Enhanced software defect prediction using krill herd algorithm with stacked LSTM with attention mechanism

Cited by: 2
Authors
Vasishth, Oshina [1]
Bansal, Ankita [1]
Affiliations
[1] Netaji Subhas Univ Technol, Dept Informat Technol, Dwarka 110078, Delhi, India
Keywords
Software defect prediction; Deep learning; LSTM; NASA Promise; Neural network
DOI
10.1007/s13198-024-02630-2
Chinese Library Classification number
T [Industrial Technology]
Subject classification code
08
Abstract
Software defect prediction (SDP) is crucial in software engineering because undetected defects can lead to significant quality issues, increased maintenance costs, and project delays. By accurately identifying defect-prone areas within software systems, SDP helps mitigate these risks, yielding more reliable software and lower overall development costs. Numerous studies have aimed to predict defects, primarily by developing machine learning (ML) and deep learning (DL) models. However, these efforts have often overlooked critical aspects such as optimal feature selection, hyperparameter tuning, and complex patterns within the data. To address these limitations, this study proposes a novel SDP model based on a stacked Long Short-Term Memory (LSTM) network with an attention mechanism. Feature selection is performed with the Krill Herd algorithm, and hyperparameters are tuned with the Tree-Structured Parzen Estimator technique. To address data imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) is applied to balance the training datasets. The stacked LSTM architecture is designed to capture complex patterns in the data by exploiting sequential information. The model is evaluated on 12 NASA datasets and 38 Apache Promise datasets using several metrics, including the Area Under the ROC Curve (AUC), F-measure, Recall, and the Matthews Correlation Coefficient (MCC). Across these 50 datasets, the proposed model achieved AUC values in the range 0.829-0.999, MCC in the range 0.534-0.988, F-measure in the range 0.753-0.994, and Recall in the range 0.734-0.99.
The proposed model is also compared against state-of-the-art models from existing studies, where it recorded the highest mean MCC value of 0.879 and mean AUC value of 0.971. The statistical significance of these results, in comparison to those studies, is confirmed using the Scott-Knott test. The findings suggest that this approach is highly effective for SDP, offering superior accuracy and reliability compared to the other models considered.
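As a reading aid for the metrics named in the abstract, the following is a minimal sketch of how Recall, F-measure, and MCC can be derived from confusion-matrix counts, treating "defective" as the positive class. The function name `defect_metrics` and the counts are hypothetical and chosen for illustration only; they are not taken from the paper. AUC is omitted because it needs ranked prediction scores, not just counts.

```python
import math


def defect_metrics(tp, fp, tn, fn):
    """Return recall, F-measure, and MCC for a binary defect classifier.

    tp/fp/tn/fn are confusion-matrix counts with "defective" as positive.
    Undefined ratios (zero denominators) are reported as 0.0 by convention.
    """
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    # MCC uses all four cells, so it stays informative on imbalanced data.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"recall": recall, "f_measure": f_measure, "mcc": mcc}


# Hypothetical counts, for illustration only (not results from the paper).
scores = defect_metrics(tp=40, fp=10, tn=45, fn=5)
print(scores)
```

MCC ranges from -1 to 1 and is 0 for a chance-level classifier, which is one reason it is often reported alongside F-measure for imbalanced defect datasets.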
Pages: 21