A new fusion of grey wolf optimizer algorithm with a two-phase mutation for feature selection

Cited by: 254
Authors
Abdel-Basset, Mohamed [1 ]
El-Shahat, Doaa [2 ]
El-henawy, Ibrahim [2 ]
de Albuquerque, Victor Hugo C. [3 ]
Mirjalili, Seyedali [4 ]
Affiliations
[1] Zagazig Univ, Fac Comp & Informat, Dept Operat Res, Zagazig, Egypt
[2] Zagazig Univ, Fac Comp & Informat, Comp Sci Dept, Zagazig, Egypt
[3] Univ Fortaleza, Fortaleza, Ceara, Brazil
[4] Torrens Univ Australia, 90 Bowen Terrace, Fortitude Valley, Qld 4006, Australia
Keywords
Feature selection; Grey wolf optimization algorithm; Wrapper method; Classifier accuracy; Cross-validation; Mutation; PARTICLE SWARM OPTIMIZATION; CROW SEARCH ALGORITHM; CLASSIFICATION;
DOI
10.1016/j.eswa.2019.112824
CLC classification number
TP18 [Theory of Artificial Intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The high dimensionality of large datasets can hinder the data mining process. Feature selection is therefore a mandatory pre-processing phase that reduces the dimensionality of a dataset by retaining only the most informative features while maximizing classification accuracy. This paper proposes a new Grey Wolf Optimizer algorithm integrated with a two-phase mutation to solve the feature selection problem for classification based on wrapper methods. A sigmoid function transforms the continuous search space into a binary one to match the binary nature of the feature selection problem. The two-phase mutation enhances the exploitation capability of the algorithm: the first mutation phase reduces the number of selected features while preserving high classification accuracy, and the second mutation phase attempts to add more informative features that increase classification accuracy. Because mutation can be time-consuming, the two-phase mutation is applied with a small probability. Since wrapper methods can give high-quality solutions, one of the most popular wrappers, the k-Nearest Neighbor (k-NN) classifier, is used, with the Euclidean distance computed to find the k nearest neighbors. Each dataset is split into training and testing data using K-fold cross-validation to mitigate overfitting. The proposed algorithm is compared with well-known and recent algorithms such as the flower algorithm, particle swarm optimization, the multi-verse optimizer, the whale optimization algorithm, and the bat algorithm on 35 datasets. Statistical analyses confirm the effectiveness of the proposed algorithm and its superior performance. (C) 2019 Elsevier Ltd. All rights reserved.
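The two mechanisms the abstract describes (sigmoid binarization of a continuous wolf position and the two-phase mutation) can be sketched roughly as follows. This is a minimal illustration reconstructed from the abstract alone, not the paper's actual implementation: the `accuracy` callback, the per-bit mutation probability `mut_prob`, and the exact loop order are assumptions.

```python
import numpy as np

def sigmoid(x):
    # Sigmoid transfer function: maps a continuous component into (0, 1),
    # interpreted as the probability that the corresponding feature is selected.
    return 1.0 / (1.0 + np.exp(-x))

def binarize(continuous_pos, rng):
    # A feature is selected (bit = 1) when the sigmoid of its continuous
    # component exceeds a uniform random threshold.
    probs = sigmoid(continuous_pos)
    return (rng.random(continuous_pos.shape) < probs).astype(int)

def two_phase_mutation(bits, accuracy, mut_prob, rng):
    # Phase 1: try dropping selected features; keep a drop whenever
    # classification accuracy does not decrease (fewer features, same quality).
    best = bits.copy()
    best_acc = accuracy(best)
    for i in np.flatnonzero(best == 1):
        if rng.random() < mut_prob:
            trial = best.copy()
            trial[i] = 0
            acc = accuracy(trial)
            if acc >= best_acc:
                best, best_acc = trial, acc
    # Phase 2: try adding unselected features; keep an addition only
    # when it strictly improves accuracy (more informative features).
    for i in np.flatnonzero(best == 0):
        if rng.random() < mut_prob:
            trial = best.copy()
            trial[i] = 1
            acc = accuracy(trial)
            if acc > best_acc:
                best, best_acc = trial, acc
    return best
```

In a full wrapper loop, `accuracy` would be the K-fold cross-validated k-NN score on the feature subset encoded by the bit vector, and `mut_prob` would be kept small so the mutation is only applied occasionally, as the abstract notes.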
Pages: 14
Related references
60 in total
[51]   A discrete particle swarm optimization method for feature selection in binary classification problems [J].
Unler, Alper ;
Murat, Alper .
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2010, 206 (03) :528-539
[52]   Grey Wolf Optimizer to Real Power Dispatch with Non-Linear Constraints [J].
Venkatakrishnan, G. R. ;
Rengaraj, R. ;
Salivahanan, S. .
CMES-COMPUTER MODELING IN ENGINEERING & SCIENCES, 2018, 115 (01) :25-45
[53]   A discrete bacterial algorithm for feature selection in classification of microarray gene expression cancer data [J].
Wang, Hong ;
Jing, Xingjian ;
Niu, Ben .
KNOWLEDGE-BASED SYSTEMS, 2017, 126 :8-19
[54]  
Xin-She Yang, 2012, Unconventional Computation and Natural Computation. Proceedings of the 11th International Conference, UCNC 2012, P240, DOI 10.1007/978-3-642-32894-7_27
[55]   Particle swarm optimisation for feature selection in classification: Novel initialisation and updating mechanisms [J].
Xue, Bing ;
Zhang, Mengjie ;
Browne, Will N. .
APPLIED SOFT COMPUTING, 2014, 18 :261-276
[56]   Efficient feature selection method using real-valued grasshopper optimization algorithm [J].
Zakeri, Arezoo ;
Hokmabadi, Alireza .
EXPERT SYSTEMS WITH APPLICATIONS, 2019, 119 :61-72
[57]  
Zawbaa HM, 2015, 2015 11TH INTERNATIONAL COMPUTER ENGINEERING CONFERENCE (ICENCO), P278, DOI 10.1109/ICENCO.2015.7416362
[58]   Feature selection using firefly optimization for classification and regression models [J].
Zhang, Li ;
Mistry, Kamlesh ;
Lim, Chee Peng ;
Neoh, Siew Chin .
DECISION SUPPORT SYSTEMS, 2018, 106 :64-85
[59]   A return-cost-based binary firefly algorithm for feature selection [J].
Zhang, Yong ;
Song, Xian-fang ;
Gong, Dun-wei .
INFORMATION SCIENCES, 2017, 418 :561-574
[60]   A Novel Hybrid Algorithm for Feature Selection Based on Whale Optimization Algorithm [J].
Zheng, Yuefeng ;
Li, Ying ;
Wang, Gang ;
Chen, Yupeng ;
Xu, Qian ;
Fan, Jiahao ;
Cui, Xueting .
IEEE ACCESS, 2019, 7 :14908-14923