A Tent Levy Flying Sparrow Search Algorithm for Wrapper-Based Feature Selection: A COVID-19 Case Study

Cited by: 4

Authors:

Yang, Qinwen [1]

Gao, Yuelin [2]

Song, Yanjie [3]

Affiliations:

[1] North Minzu Univ, Sch Comp Sci & Engn, Yinchuan 750021, Peoples R China

[2] Ningxia Key Lab Intelligent Informat & Big Data Pr, Yinchuan 750021, Peoples R China

[3] Natl Univ Def Technol, Coll Syst Engn, Changsha 410073, Peoples R China

Source:

SYMMETRY-BASEL | 2023 / Vol. 15 / Issue 02

Funding:

National Natural Science Foundation of China;

Keywords:

sparrow search algorithm; feature selection; COVID-19; GENETIC ALGORITHM; OPTIMIZATION; EVOLUTIONARY; INFORMATION; DIAGNOSIS;

DOI:

10.3390/sym15020316

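Chinese Library Classification:

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];

Discipline classification codes: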

07 ; 0710 ; 09 ;

Abstract:

The "curse of dimensionality" brought on by the rapid development of information science can degrade performance on big datasets, and it also makes the problems of symmetry and asymmetry increasingly prominent. Feature selection (FS) can eliminate irrelevant information in big data and improve accuracy. The recently proposed Sparrow Search Algorithm (SSA) performs well on FS tasks, but it suffers from poor population diversity and tends to fall into local optima. To address this issue, we propose a variant of SSA called the Tent Levy Flying Sparrow Search Algorithm (TFSSA), which selects the best feature subset in a wrapper-based method for classification. After its performance is evaluated on the CEC2020 test suite, TFSSA is used to select the feature combination that maximizes classification accuracy while simultaneously minimizing the number of selected features. To evaluate the proposed TFSSA, we conducted experiments on twenty-one datasets from the UCI repository and compared it with nine algorithms from the literature, using nine metrics to evaluate and compare their performance. The method is also applied to a coronavirus disease (COVID-19) dataset, where its classification accuracy of 93.47% and its average of 2.1 selected features are the best among the compared methods. The experimental results and comparisons on all datasets demonstrate the effectiveness of TFSSA relative to other wrapper-based algorithms.
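The two objectives named in the abstract (maximize classification accuracy, minimize the number of selected features) are commonly folded into a single wrapper fitness value. The following is a minimal sketch of that idea, not the paper's implementation: the function names, the leave-one-out 1-nearest-neighbour classifier, the weight `alpha = 0.99`, and the toy data are all assumptions chosen for illustration.

```python
import math


def loo_1nn_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier,
    using only the feature columns where mask is truthy.
    (Assumed classifier; the abstract does not name one.)"""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        best_dist, best_label = math.inf, None
        for k, xk in enumerate(X):
            if k == i:
                continue  # leave the query sample out
            d = sum((xi[j] - xk[j]) ** 2 for j in cols)
            if d < best_dist:
                best_dist, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)


def wrapper_fitness(X, y, mask, alpha=0.99):
    """Lower is better: weighted sum of the error rate and the
    fraction of features kept. alpha is a hypothetical weight."""
    error = 1.0 - loo_1nn_accuracy(X, y, mask)
    ratio = sum(1 for keep in mask if keep) / len(mask)
    return alpha * error + (1.0 - alpha) * ratio


# Toy data (made up): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, -3.0], [0.2, 4.0],
     [1.0, -4.0], [1.1, 5.0], [1.2, -3.5]]
y = [0, 0, 0, 1, 1, 1]

for mask in ([1, 0], [0, 1], [1, 1]):
    print(mask, wrapper_fitness(X, y, mask))
```

In a wrapper method, a search algorithm such as TFSSA would propose candidate masks and use a fitness of this kind to rank them; the search itself is not shown here.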


Pages: 39
