A novel community detection based genetic algorithm for feature selection

Cited by: 102
Authors
Rostami, Mehrdad [1]
Berahmand, Kamal [2]
Forouzandeh, Saman [3]
Affiliations
[1] Univ Kurdistan, Dept Comp Engn, Sanandaj, Iran
[2] Queensland Univ Technol, Dept Sci & Engn, Brisbane, Qld, Australia
[3] Univ Appl Sci & Technol, Ctr Tehran Municipal, ICT Org, Dept Comp Engn, Tehran, Iran
Keywords
Machine learning; Feature selection; Genetic algorithm; Graph theory; Multi-objective; Particle swarm optimization; Mutual information; Classification; Scheme
DOI
10.1186/s40537-020-00398-3
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Feature selection is an essential data preprocessing stage in data mining. Its core principle is to pick a subset of the available features by excluding those with almost no predictive information as well as redundant features that are highly correlated with one another. In recent years, a variety of meta-heuristic methods have been introduced to eliminate redundant and irrelevant features as far as possible from high-dimensional datasets. One of the main disadvantages of existing meta-heuristic approaches is that they often neglect the correlation between the selected features. In this article, the authors propose a genetic algorithm for feature selection based on community detection, which works in three steps. In the first step, the feature similarities are calculated. In the second step, the features are grouped into clusters by community detection algorithms. In the third step, features are selected by a genetic algorithm with a new community-based repair operation. The performance of the presented approach was analyzed on nine benchmark classification problems. The authors also compared the efficiency of the proposed approach with the results of four available feature selection algorithms. A comparison with three recent feature selection methods based on PSO, ACO, and ABC algorithms on three classifiers showed that the accuracy of the proposed method is on average 0.52% higher than PSO, 1.20% higher than ACO, and 1.57% higher than ABC.
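The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, the community detection algorithm, or the details of the repair operation. The sketch assumes absolute Pearson correlation as the similarity, uses connected components of a thresholded similarity graph as a simple stand-in for community detection, and uses a hypothetical repair rule that keeps a fixed number of selected features per community. The names `similarity_matrix`, `communities`, `repair`, `threshold`, and `per_group` are illustrative. The genetic search itself (fitness evaluation with a classifier, crossover, mutation) is omitted.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def similarity_matrix(columns):
    """Step 1 (assumed measure): pairwise |Pearson| between feature columns."""
    k = len(columns)
    return [[abs(pearson(columns[i], columns[j])) for j in range(k)]
            for i in range(k)]


def communities(sim, threshold=0.7):
    """Step 2 (stand-in, not a real community detection algorithm):
    connected components of the graph linking features whose
    similarity exceeds `threshold`."""
    k = len(sim)
    seen, groups = set(), []
    for start in range(k):
        if start in seen:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:
            u = stack.pop()
            group.append(u)
            for v in range(k):
                if v not in seen and sim[u][v] > threshold:
                    seen.add(v)
                    stack.append(v)
        groups.append(sorted(group))
    return groups


def repair(chromosome, groups, per_group=1, rng=random):
    """Step 3 (hypothetical repair rule): adjust a binary chromosome so that
    each community has exactly `per_group` selected features, or all of its
    features if the community is smaller than `per_group`."""
    fixed = list(chromosome)
    for group in groups:
        chosen = [f for f in group if fixed[f]]
        unchosen = [f for f in group if not fixed[f]]
        target = min(per_group, len(group))
        while len(chosen) > target:      # too many: drop a random one
            fixed[chosen.pop(rng.randrange(len(chosen)))] = 0
        while len(chosen) < target:      # too few: add a random one
            f = unchosen.pop(rng.randrange(len(unchosen)))
            fixed[f] = 1
            chosen.append(f)
    return fixed


# Toy data: features 0 and 1 are perfectly correlated, as are 2 and 3.
columns = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 10],
    [1, -1, 1, -1, 1],
    [2, -2, 2, -2, 2],
]
groups = communities(similarity_matrix(columns))
repaired = repair([1, 1, 0, 0], groups, rng=random.Random(0))
```

On the toy data, the chromosome `[1, 1, 0, 0]` selects two redundant features from one group and none from the other. The repair step removes one of the redundant pair and adds one feature from the unrepresented group, which is one way a community-based repair could address the neglected correlation between selected features.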
Pages: 27