A Feature Selection-Based K-NN Model for Fast Software Defect Prediction

Cited by: 5
Authors
Awotunde, Joseph Bamidele [1]
Misra, Sanjay [2]
Adeniyi, Abidemi Emmanuel [2]
Abiodun, Moses Kazeem [1, 3]
Kaushik, Manju [4]
Lawrence, Morolake Oladayo [5]
Affiliations
[1] Univ Ilorin, Dept Comp Sci, Ilorin, Nigeria
[2] Ostfold Univ Coll, Dept Comp Sci & Commun, Halden, Norway
[3] Landmark Univ, Dept Comp Sci, Omu Aran, Nigeria
[4] Amity Univ, Amity Inst Informat Technol, Jaipur, Rajasthan, India
[5] Baze Univ, Dept Comp Sci, Abuja, Nigeria
Source
COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2022 WORKSHOPS, PART IV | 2022 / Vol. 13380
Keywords
Software defect prediction; Machine learning; Extreme gradient boost; Feature selection; Prediction; Software development life cycle;
DOI
10.1007/978-3-031-10542-5_4
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Software Defect Prediction (SDP) is a technique for predicting software defects during the software development life cycle. Much prior research has addressed SDP, but the performance of these methods varies across datasets, which makes them inconsistent when applied to an unknown software project. A hybrid technique that combines feature selection with machine learning (ML) can be very efficient for SDP, because it takes advantage of several methods to reach better prediction accuracy on a given dataset than an individual classifier. The major issues with individual ML-based models for SDP are long detection time, vulnerability of the software project, and high dimensionality of the feature parameters. This study therefore proposes a hybrid model that uses a feature selection-enabled Extreme Gradient Boost (XGB) classifier to address these challenges. The cleaned NASA MDP datasets were used to implement the proposed model, and performance metrics such as F-score, accuracy, and MCC were used to evaluate it. Compared with state-of-the-art methods that do not use feature selection, the proposed model performs better on the metrics used. The results reveal that the proposed model outperformed all other prediction techniques.
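The abstract names F-score, accuracy, and MCC (Matthews correlation coefficient) as evaluation metrics. As a minimal sketch, assuming a binary defective / non-defective labelling, the standard definitions of these metrics can be computed from a confusion matrix as follows. The function name `binary_metrics` and the counts are hypothetical illustrations; they are not taken from the paper or its results.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, F-score (F1), and MCC for a 2x2 confusion matrix.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives, and true negatives. A metric whose denominator
    is zero is reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0

    # F1 is the harmonic mean of precision and recall,
    # which simplifies to 2TP / (2TP + FP + FN).
    f_denom = 2 * tp + fp + fn
    f_score = 2 * tp / f_denom if f_denom else 0.0

    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0

    return {"accuracy": accuracy, "f_score": f_score, "mcc": mcc}


# Hypothetical counts for illustration only (not results from the paper).
metrics = binary_metrics(tp=30, fp=10, fn=20, tn=140)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Defect datasets are often imbalanced, with far fewer defective than clean modules. In that situation accuracy alone can look favourable, whereas MCC draws on all four cells of the confusion matrix; this is one common reason for reporting the three metrics together.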
Pages: 49-61
Number of pages: 13