Frequency based feature selection method using whale algorithm

Cited by: 44
Authors
Nematzadeh, Hossein [1]
Enayatifar, Rasul [2]
Mahmud, Maqsood [3]
Akbari, Ebrahim [1]
Affiliations
[1] Islamic Azad Univ, Dept Comp Engn, Sari Branch, Sari, Iran
[2] Islamic Azad Univ, Dept Comp Engn, Firoozkooh Branch, Firoozkooh, Iran
[3] Imam Abdulrahman Bin Faisal University IAU, Dept Management Informat Syst, Coll Business Adm, Dammam, Saudi Arabia
Keywords
Feature selection; Whale algorithm; Mutual congestion; Genetic algorithm; Hybrid model
DOI
10.1016/j.ygeno.2019.01.006
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Feature selection is the problem of finding the subset of features that has the greatest impact on predicting class labels. It is especially valuable for high-dimensional datasets. This paper proposes a filter feature selection method for high-dimensional binary medical datasets: Colon, Central Nervous System (CNS), GLI_85, and SMK_CAN_187. The proposed method has three stages. First, the whale algorithm is used to discard irrelevant features. Second, the remaining features are ranked using a frequency-based heuristic called Mutual Congestion. Third, majority voting is applied to the best feature subsets constructed by forward feature selection with threshold tau = 10. This work provides evidence that Mutual Congestion on its own is powerful enough to predict class labels. Furthermore, applying the whale algorithm increases the overall accuracy of Mutual Congestion in most cases. The findings also show that the proposed method improves prediction while selecting fewer features than state-of-the-art methods. https://github.com/hnematzadeh
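The third stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that tau caps the size of the nested subsets built from an already ranked feature list, and that each subset yields one list of predicted labels. The function names forward_subsets and majority_vote, the feature names, and the toy predictions are all hypothetical. The whale algorithm and the Mutual Congestion ranking are not reproduced here, since the abstract does not define them.

```python
from collections import Counter


def forward_subsets(ranked_features, tau=10):
    """Build nested candidate subsets: top-1, top-2, ..., up to top-tau.

    Assumption: tau is read here as an upper bound on subset size.
    The abstract only states "threshold tau = 10".
    """
    limit = min(tau, len(ranked_features))
    return [ranked_features[:k] for k in range(1, limit + 1)]


def majority_vote(predictions):
    """Return, per sample, the label predicted most often.

    predictions: one list of labels per feature subset, all equally long.
    """
    voted = []
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical ranked feature names (best first).
ranked = ["g7", "g2", "g9"]
subsets = forward_subsets(ranked, tau=2)

# Hypothetical predictions for three samples from three subset-based classifiers.
votes = majority_vote([[0, 1, 1], [0, 0, 1], [1, 1, 1]])
```

With an odd number of voters on a binary problem, as in the toy data, no ties can occur. How ties are broken in the published method is not stated in the abstract.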
Pages: 1946-1955
Number of pages: 10