Ensemble feature selection using bi-objective genetic algorithm

Cited by: 125
Authors
Das, Asit K. [1]
Das, Sunanda [2]
Ghosh, Arka [1]
Affiliations
[1] Indian Inst Engn Sci & Technol, Dept Comp Sci & Technol, Sibpur, India
[2] Neotia Inst Technol Management & Sci, Kolkata, India
Keywords
Feature selection; Genetic algorithm; Rough set theory; Mutual information; Evolutionary optimization; Supervised learning; ROUGH SET; OPTIMIZATION; SEARCH
DOI
10.1016/j.knosys.2017.02.013
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses the feature selection problem in data mining by proposing a feature selection method based on a bi-objective genetic algorithm. Boundary region analysis from rough set theory and multivariate mutual information from information theory serve as the two objective functions, so that only precise and informative data are selected from the data set. The data set is sampled with replacement, and the method is applied to each sampled data set to determine its non-dominated feature subsets. An ensemble of such bi-objective genetic algorithm based feature selectors is then built using parallel implementations to produce a more generalized feature subset: the outputs of the individual feature selectors are aggregated using a novel dominance-based principle to yield the final feature subset. The proposed work is validated on datasets from a repository dedicated to feature selection as well as on datasets from the UCI machine learning repository, and the experimental results are compared with related state-of-the-art feature selection methods to show the effectiveness of the proposed ensemble feature selection method. (C) 2017 Elsevier B.V. All rights reserved.
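The notion of a non-dominated feature subset mentioned in the abstract can be illustrated with a minimal Python sketch. It shows only the general Pareto-dominance test for two objectives that are both maximized; it is not the authors' implementation and does not reproduce their aggregation principle. The function names, feature names, and score values are hypothetical, and the two scores merely stand in for the rough-set and mutual-information objectives.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumption: each candidate subset has two scores, both to be maximized.
Scores = Tuple[float, float]


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is at least as good as b on both objectives and strictly better on one."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


def non_dominated(candidates: Dict[FrozenSet[str], Scores]) -> List[FrozenSet[str]]:
    """Return the subsets whose score pair is not dominated by any other candidate."""
    front = []
    for subset, score in candidates.items():
        dominated = any(
            dominates(other_score, score)
            for other_subset, other_score in candidates.items()
            if other_subset != subset
        )
        if not dominated:
            front.append(subset)
    return front


# Hypothetical feature subsets and scores, for illustration only.
candidates = {
    frozenset({"f1", "f2"}): (0.90, 0.40),
    frozenset({"f2", "f3"}): (0.70, 0.80),
    frozenset({"f1"}): (0.60, 0.30),
    frozenset({"f1", "f3", "f4"}): (0.85, 0.60),
}

front = non_dominated(candidates)
```

In this toy data, the subset containing only `f1` is dominated by `{"f1", "f2"}` (worse on both scores), while the other three subsets each trade one objective against the other and so remain on the front.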
Pages: 116-127
Number of pages: 12