An Efficient High-dimensional Feature Selection Approach Driven By Enhanced Multi-strategy Grey Wolf Optimizer for Biological Data Classification

Cited by: 26
Authors
Mafarja, Majdi [1]
Thaher, Thaer [2, 3]
Too, Jingwei [4]
Chantar, Hamouda [5]
Turabieh, Hamza [6]
Houssein, Essam H. [7]
Emam, Marwa M. [7]
Affiliations
[1] Birzeit Univ, Dept Comp Sci, Birzeit, Palestine
[2] Arab Amer Univ, Dept Comp Syst Engn, Jenin, Palestine
[3] Al Quds Univ, Informat Technol Engn, Jerusalem, Palestine
[4] Univ Teknikal Malaysia Melaka, Fac Elect Engn, Durian Tunggal 76100, Melaka, Malaysia
[5] Sebha Univ, Fac Informat Technol, Sebha, Libya
[6] Taif Univ, Coll Comp & Informat Technol, Dept Informat Technol, POB 11099, Taif 21944, Saudi Arabia
[7] Minia Univ, Fac Comp & Informat, Al Minya, Egypt
Keywords
Feature selection; Binary grey wolf optimizer; Classification; Meta-heuristics; Biological data; PARTICLE SWARM OPTIMIZATION; SUPPORT VECTOR MACHINE; ARTIFICIAL BEE COLONY; DIFFERENTIAL EVOLUTION; HYBRID APPROACH; ALGORITHM;
DOI
10.1007/s00521-022-07836-8
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Biological data generally contain complex, high-dimensional samples. In addition, biological datasets typically have far fewer samples than features, so the optimal subset of features must be selected carefully. Feature selection (FS) is a vital stage in biological data mining applications (e.g., classification) for dealing with the curse of dimensionality and finding highly informative features. This work proposes an effective FS approach based on a new version of the Grey Wolf Optimizer (GWO), called the Multi-strategy Grey Wolf Optimizer (MSGWO), to improve feature selection for biological data classification. MSGWO is used in feature selection to find the optimal subset of features for discriminating between classes, to address premature convergence, and to enhance the local search ability of the GWO algorithm. Multiple exploration and exploitation strategies are proposed to enhance the global and local search abilities of the GWO algorithm throughout the optimization process. A support vector machine (SVM) classifier is used to evaluate the proposed GWO-based FS approaches. MSGWO was evaluated on thirteen high-dimensional biological datasets with small numbers of instances, obtained from the UCI repository. The reported results confirm that employing multiple exploration and multiple exploitation strategies is highly useful for enhancing the search tendency of MSGWO in the FS domain. Statistical tests showed that the superiority of the proposed approach is statistically significant compared to the basic GWO and similar wrapper-based FS techniques, including binary particle swarm optimization (BPSO), the binary bat algorithm (BBA), the binary gravitational search algorithm (BGSA), and the binary whale optimization algorithm (BWOA). In terms of classification accuracy, MSGWO yielded better accuracy rates than the standard GWO algorithm on 84% of the biological datasets used. MSGWO also recorded better accuracy rates than its other competitors in all 13 cases. In terms of the number of selected features, MSGWO yielded excellent reduction rates compared to its peers.
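As a rough illustration of the kind of wrapper-based search the abstract describes, the following is a minimal sketch of a basic binary GWO loop. It is not the MSGWO method, whose exploration and exploitation strategies are not specified in this record. Several parts are assumptions made for illustration only: the sigmoid transfer function used to binarize positions, the weighted objective (`alpha` times error plus a feature-ratio penalty), and the toy `error` term, which stands in for the SVM classification error by counting how many of a hypothetical set of `informative` features a mask misses.

```python
import math
import random


def fitness(mask, informative, alpha=0.99):
    """Toy wrapper-style objective; lower is better.

    ASSUMPTION: 'error' is a stand-in for a classifier's error rate. It is
    the fraction of the hypothetical informative features the mask misses.
    """
    n_selected = sum(mask)
    if n_selected == 0:
        return 1.0  # an empty subset is treated as the worst case
    missed = sum(1 for j in informative if mask[j] == 0)
    error = missed / len(informative)
    # Weighted sum of error and the fraction of features kept (assumed form).
    return alpha * error + (1 - alpha) * n_selected / len(mask)


def binary_gwo(n_features, informative, n_wolves=10, n_iters=50, seed=0):
    """Basic binary GWO sketch (needs n_wolves >= 3); returns (mask, fitness)."""
    rng = random.Random(seed)
    wolves = [[rng.randint(0, 1) for _ in range(n_features)]
              for _ in range(n_wolves)]
    best_mask, best_fit = None, float("inf")

    for t in range(n_iters):
        # Copy the three best wolves so in-place updates do not alter them.
        ranked = sorted(wolves, key=lambda w: fitness(w, informative))
        leaders = [list(w) for w in ranked[:3]]  # alpha, beta, delta wolves
        lead_fit = fitness(leaders[0], informative)
        if lead_fit < best_fit:
            best_mask, best_fit = leaders[0], lead_fit

        a = 2.0 * (1 - t / n_iters)  # decreases linearly from 2 toward 0
        for w in wolves:
            for j in range(n_features):
                x = 0.0
                for leader in leaders:
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    D = abs(C * leader[j] - w[j])
                    x += leader[j] - A * D
                x /= 3.0  # average of the three leader-guided positions
                # Sigmoid transfer: continuous position -> probability of 1.
                prob = 1.0 / (1.0 + math.exp(-10.0 * (x - 0.5)))
                w[j] = 1 if rng.random() < prob else 0

    # Account for the population produced by the final update.
    for w in wolves:
        f = fitness(w, informative)
        if f < best_fit:
            best_mask, best_fit = list(w), f
    return best_mask, best_fit


if __name__ == "__main__":
    mask, fit = binary_gwo(n_features=20, informative=[1, 4, 9, 15])
    print("selected features:", [j for j, bit in enumerate(mask) if bit])
    print("fitness:", fit)
```

In a real wrapper setup, the toy `error` term would be replaced by the cross-validated error of a classifier (an SVM in the paper) trained on the columns selected by `mask`.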
Pages: 1749-1775
Number of pages: 27