Using reinforcement learning to find an optimal set of features

被引:37
作者
Fard, Seyed Mehdi Hazrati [1 ]
Hamzeh, Ali [1 ]
Hashemi, Sattar [1 ]
机构
[1] Shiraz Univ, Dept Elect & Comp Engn, Shiraz, Iran
关键词
Feature selection; Markov decision process; Reinforcement learning; Temporal difference;
D O I
10.1016/j.camwa.2013.06.031
中图分类号
O29 [应用数学];
学科分类号
070104 ;
摘要
Identifying the most characterizing features of observed data is critical for minimizing the classification error. Feature selection is the process of identifying a small subset of highly predictive features out of a large set of candidate features. In the literature, many feature selection methods approach the task as a search problem, where each state in the search space is a possible feature subset. In this study, we consider feature selection problem as a reinforcement learning problem in general and use a well-known method, temporal difference, to traverse the state space and select the best subset of features. Specifically, first, we consider the state space as a Markov decision process, and then we introduce an optimal graph search to overcome the complexity of the problem of concern. Since this approach needs a state evaluation paradigm as an aid to traverse the promising regions in the state space, the presence of a low-cost evaluation function is necessary. This method initially explores the lattice of feature sets, and then exploits the obtained experiments. Finally, two methods, based on filters and wrappers, are proposed for the ultimate selection of features. Our empirical evaluation shows that this strategy performs well in comparison with other commonly used feature selection strategies, while maintaining compatibility with all datasets in hand. (C) 2013 Elsevier Ltd. All rights reserved.
引用
收藏
页码:1892 / 1904
页数:13
相关论文
共 30 条
[1]  
[Anonymous], 1971, Statistical Methods in Experimental Physics
[2]  
[Anonymous], P 9 INT WORKSH MACH
[3]  
[Anonymous], 1984, OLSHEN STONE CLASSIF, DOI 10.2307/2530946
[4]  
[Anonymous], 2004, Advances in Neural Information Processing Systems
[5]  
[Anonymous], 2000, Pattern Classification
[6]  
[Anonymous], 1997, MACHINE LEARNING, MCGRAW-HILL SCIENCE/ENGINEERING/MATH
[7]   Finite-time analysis of the multiarmed bandit problem [J].
Auer, P ;
Cesa-Bianchi, N ;
Fischer, P .
MACHINE LEARNING, 2002, 47 (2-3) :235-256
[8]  
Boull'e M., JMLR 08, P1659
[9]  
Breiman L., 2001, Random forests. Machine learning
[10]  
Collobert R., 2002, Technical report