An Efficient High-dimensional Feature Selection Approach Driven By Enhanced Multi-strategy Grey Wolf Optimizer for Biological Data Classification

Cited by: 22
Authors
Mafarja, Majdi [1]
Thaher, Thaer [2, 3]
Too, Jingwei [4]
Chantar, Hamouda [5]
Turabieh, Hamza [6]
Houssein, Essam H. [7]
Emam, Marwa M. [7]
Affiliations
[1] Birzeit Univ, Dept Comp Sci, Birzeit, Palestine
[2] Arab Amer Univ, Dept Comp Syst Engn, Jenin, Palestine
[3] Al Quds Univ, Informat Technol Engn, Jerusalem, Palestine
[4] Univ Teknikal Malaysia Melaka, Fac Elect Engn, Durian Tunggal 76100, Melaka, Malaysia
[5] Sebha Univ, Fac Informat Technol, Sebha, Libya
[6] Taif Univ, Coll Comp & Informat Technol, Dept Informat Technol, POB 11099, Taif 21944, Saudi Arabia
[7] Minia Univ, Fac Comp & Informat, Al Minya, Egypt
Keywords
Feature selection; Binary grey wolf optimizer; Classification; Meta-heuristics; Biological data; Particle swarm optimization; Support vector machine; Artificial bee colony; Differential evolution; Hybrid approach; Algorithm
DOI
10.1007/s00521-022-07836-8
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Biological data generally contain complex, high-dimensional samples. In addition, the number of samples in biological datasets is far smaller than the number of features, so features must be selected carefully to determine the optimal subset. Feature selection (FS) is a vital stage in biological data mining applications (e.g., classification) for mitigating the curse of dimensionality and finding highly informative features. This work proposes an effective FS approach based on a new version of the Grey Wolf Optimizer (GWO), called the Multi-strategy Grey Wolf Optimizer (MSGWO), to improve feature selection for biological data classification. MSGWO is used in feature selection to find the subset of features that best discriminates between classes, to address premature convergence, and to enhance the local search ability of the GWO algorithm. Multiple exploration and exploitation strategies are proposed to enhance the global and local search abilities of the GWO algorithm throughout the optimization process. The support vector machine (SVM) classifier is used to evaluate the proposed GWO-based FS approaches. MSGWO was evaluated on thirteen high-dimensional biological datasets with few instances, obtained from the UCI repository. The reported results confirm that employing multiple exploration and multiple exploitation strategies is highly useful for enhancing the search tendency of MSGWO in the FS domain. Statistical tests showed that the superiority of the proposed approach is statistically significant compared to the basic GWO and similar wrapper-based FS techniques, including binary particle swarm optimization (BPSO), the binary bat algorithm (BBA), the binary gravitational search algorithm (BGSA), and the binary whale optimization algorithm (BWOA). In terms of classification accuracy, MSGWO yielded better accuracy rates than the standard GWO algorithm on 84% of the biological datasets used. MSGWO also recorded better accuracy rates than its other competitors in all 13 cases. In terms of the number of selected features, MSGWO achieved excellent reduction rates compared to its peers.
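The abstract does not give the update equations, so the following is only a rough sketch of the general idea of wrapper-based feature selection with a binary grey wolf optimizer. It uses the canonical GWO position update with a sigmoid transfer function, not the multi-strategy MSGWO proposed in the paper. The names `fitness`, `binarize`, `binary_gwo`, and `toy_error`, the weight `alpha=0.99`, and the toy error function standing in for an SVM's validation error are all assumptions made for illustration.

```python
import math
import random


def fitness(mask, error_rate, alpha=0.99):
    """Weighted sum of classification error and fraction of features kept.

    Lower is better. alpha=0.99 is an assumed weight, not taken from the paper.
    """
    if not any(mask):
        return float("inf")  # an empty subset cannot be evaluated
    ratio = sum(mask) / len(mask)
    return alpha * error_rate(mask) + (1 - alpha) * ratio


def binarize(position, rng):
    """Sigmoid transfer: map each continuous coordinate to a keep/drop bit."""
    return [1 if rng.random() < 1 / (1 + math.exp(-x)) else 0 for x in position]


def binary_gwo(n_features, error_rate, n_wolves=8, n_iter=30, seed=0):
    """Canonical GWO update on continuous positions, binarized for evaluation."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(-1, 1) for _ in range(n_features)]
              for _ in range(n_wolves)]
    best_mask, best_fit = None, float("inf")
    for t in range(n_iter):
        scored = []
        for w in wolves:
            mask = binarize(w, rng)
            f = fitness(mask, error_rate)
            scored.append((f, w))
            if f < best_fit:
                best_fit, best_mask = f, mask
        scored.sort(key=lambda pair: pair[0])
        # Copy the three best wolves (alpha, beta, delta) before moving anyone.
        leaders = [list(w) for _, w in scored[:3]]
        a = 2 - 2 * t / n_iter  # decreases linearly from 2 towards 0
        for w in wolves:
            for j in range(n_features):
                total = 0.0
                for leader in leaders:
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    total += leader[j] - A * abs(C * leader[j] - w[j])
                # Clamp so the sigmoid stays numerically well-behaved.
                w[j] = max(-6.0, min(6.0, total / 3))
    return best_mask, best_fit


def toy_error(mask):
    """Stand-in for a classifier's error: only the first 3 features matter."""
    return 1 - sum(mask[:3]) / 3
```

Calling `binary_gwo(10, toy_error)` returns a 0/1 mask over ten hypothetical features together with its fitness. In a real wrapper setup, `toy_error` would be replaced by the cross-validated error of a classifier such as an SVM trained on the selected columns.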
Pages: 1749-1775
Page count: 27