Evolutionary computation for feature selection in classification problems

Cited by: 47
Author
de la Iglesia, Beatriz [1]
Affiliation
[1] Univ E Anglia, Sch Comp Sci, Norwich NR4 7TJ, Norfolk, England
Keywords
FEATURE SUBSET-SELECTION; EFFICIENT FEATURE-SELECTION; GENETIC ALGORITHM; MEMETIC ALGORITHMS; OPTIMIZATION; GA; ACO; METAHEURISTICS; SYSTEM; COLONY;
DOI
10.1002/widm.1106
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature subset selection (FSS) has received a great deal of attention in statistics, machine learning, and data mining. Real-world data analyzed by data mining algorithms can involve a large number of redundant or irrelevant features, or simply too many features for a learning algorithm to handle efficiently. Feature selection is becoming essential as databases grow in size and complexity. The selection process is expected to bring benefits in terms of better-performing models, computational efficiency, and simpler, more understandable models. Evolutionary computation (EC) encompasses a number of nature-inspired techniques such as genetic algorithms, genetic programming, ant colony optimization, and particle swarm optimization. Such techniques are well suited to feature selection because the representation of a feature subset is straightforward and the evaluation can be easily accomplished through the use of wrapper or filter algorithms. Furthermore, the ability of such heuristic algorithms to search large spaces efficiently is of great advantage to the feature selection problem. Here, we review the use of different EC paradigms for feature selection in classification problems. We discuss details of each implementation, including representation, evaluation, and validation. The review enables us to identify the best EC algorithms for FSS and to point to future research directions. © 2013 John Wiley & Sons, Ltd.
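The abstract's remark that a feature subset is straightforward to represent and evaluate can be illustrated with a minimal sketch. It is not taken from the reviewed paper: the toy data, the leave-one-out 1-NN wrapper, the size penalty, and all function names and parameter values below are assumptions chosen purely for illustration. A subset is encoded as a bitmask (one bit per feature), and a small genetic algorithm with elitism searches over the masks.

```python
import random

def make_data(rng, n_samples=40, n_features=8):
    """Hypothetical toy data: only feature 0 carries class signal; the rest are noise."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] = label * 4.0 + rng.gauss(0.0, 0.5)  # the informative feature
        X.append(row)
        y.append(label)
    return X, y

def wrapper_fitness(mask, X, y, penalty=0.01):
    """Wrapper evaluation (assumed): leave-one-out 1-NN accuracy minus a size penalty."""
    idx = [j for j, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an empty subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        best_d, best_label = None, None
        for k in range(len(X)):
            if k == i:
                continue
            d = sum((X[i][j] - X[k][j]) ** 2 for j in idx)
            if best_d is None or d < best_d:
                best_d, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X) - penalty * len(idx)

def evolve(X, y, rng, pop_size=12, generations=10, p_mut=0.1):
    """Bitmask GA: binary tournament selection, one-point crossover, bit-flip mutation."""
    n = len(X[0])

    def fit(mask):
        return wrapper_fitness(mask, X, y)

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=fit, reverse=True)
        history.append(fit(pop[0]))
        nxt = [pop[0][:]]  # elitism: carry the best mask over unchanged
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 2), key=fit)
            b = max(rng.sample(pop, 2), key=fit)
            cut = rng.randint(1, n - 1)
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            nxt.append(child)
        pop = nxt
    best = max(pop, key=fit)
    history.append(fit(best))
    return best, history

rng = random.Random(0)
X, y = make_data(rng)
best, history = evolve(X, y, rng)
print("selected features:", [j for j, bit in enumerate(best) if bit])
print("fitness:", round(history[-1], 3))
```

Swapping `wrapper_fitness` for a classifier-independent score (for example a correlation-based measure) would turn this into a filter-style evaluation while leaving the representation and search untouched.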
Pages: 381-407
Number of pages: 27