A New Approach for Wrapper Feature Selection Using Genetic Algorithm for Big Data

Cited by: 11
Authors
Bouaguel, Waad [1]
Affiliation
[1] Univ Tunis, LARODEC, ISG, Tunis, Tunisia
Source
INTELLIGENT AND EVOLUTIONARY SYSTEMS, IES 2015 | 2016 / Vol. 5
Keywords
Wrapper; Feature selection; Big data; Classification; Prediction
DOI
10.1007/978-3-319-27000-5_6
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The increased dimensionality of genomic and proteomic data produced by microarray and mass spectrometry technology makes training and testing general classification methods difficult. Such data call for specialized analysis, and one common way to handle high dimensionality is to identify the most relevant features in the data. Wrapper feature selection is one of the most common and effective techniques for this purpose. Although efficient, wrapper methods are limited by the fact that their results depend on the search strategy. When a complex search is used, choosing the best subset of features may take much longer and may be impractical in some cases. Hence we propose a new wrapper feature selection approach for big data, based on a random search that uses a genetic algorithm and prior information. The new approach was tested on two biological datasets and compared with two well-known wrapper feature selection approaches; the results indicate that our approach gives the best performance.
Pages: 75-83
Page count: 9