Q-Learning with Fisher Score for Feature Selection of Large-Scale Data Sets

Cited by: 3
Authors
Gan, Min [1 ,2 ]
Zhang, Li [1 ,2 ,3 ]
Affiliations
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou, Jiangsu, Peoples R China
[2] Soochow Univ, Joint Int Res Lab Machine Learning & Neuromorph C, Suzhou, Jiangsu, Peoples R China
[3] Soochow Univ, Prov Key Lab Comp Informat Proc Technol, Suzhou, Jiangsu, Peoples R China
Source
KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2021, PT II | 2021, Vol. 12816
Keywords
Feature selection; Fisher score; Q-learning; Large-scale
DOI
10.1007/978-3-030-82147-0_25
CLC number
TP18 [Theory of Artificial Intelligence]
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature selection chooses useful features from the candidate set and is one of the main approaches to data dimensionality reduction. Because general feature selection methods operate directly on the entire given data set, they are time-consuming on large-scale data sets. To address this issue, this paper proposes a novel feature selection method, Q-learning with Fisher score (QLFS), for large-scale data sets. QLFS adopts the Q-learning framework of reinforcement learning (RL) and takes the Fisher score (FS), a filter method for feature selection, as its internal reward. Here, FS is modified to compute the ratio of the between-class to the within-class distance for a feature subset rather than for a single feature. By selecting a portion of the training samples in each episode, QLFS can perform batch learning and thus process large-scale data sets in batches. Experimental results on several large-scale UCI data sets show that QLFS not only improves classification performance but also trains faster than the compared methods.
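The abstract's two ingredients — a Fisher score evaluated on a feature *subset* rather than a single feature, and a Q-learning loop that uses its increments as the internal reward while sampling a mini-batch of training data per episode — can be sketched as follows. This is a minimal illustration under stated assumptions: the trace-based form of the subset score, the state abstraction (state = number of features chosen so far), and all function and hyper-parameter names (`subset_fisher_score`, `qlfs_sketch`, `alpha`, `gamma`, `eps`, `batch_size`) are hypothetical, not the paper's exact formulation.

```python
import numpy as np

def subset_fisher_score(X, y, subset):
    """Ratio of between-class to within-class distance for a feature
    subset (trace-based form; an assumption, not the paper's formula)."""
    Xs = X[:, list(subset)]
    mu = Xs.mean(axis=0)
    sb, sw = 0.0, 0.0
    for c in np.unique(y):
        Xc = Xs[y == c]
        mu_c = Xc.mean(axis=0)
        sb += len(Xc) * np.sum((mu_c - mu) ** 2)  # between-class distance
        sw += np.sum((Xc - mu_c) ** 2)            # within-class distance
    return sb / (sw + 1e-12)

def qlfs_sketch(X, y, n_select, episodes=50, alpha=0.1, gamma=0.9,
                eps=0.2, batch_size=None, rng=None):
    """Epsilon-greedy Q-learning over feature choices, rewarded by the
    increase in the subset Fisher score; each episode may use only a
    mini-batch of the training samples (the paper's batch learning idea)."""
    rng = np.random.default_rng(rng)
    n_features = X.shape[1]
    Q = np.zeros((n_select, n_features))  # Q[t, f]: value of picking f at step t
    for _ in range(episodes):
        # draw a mini-batch of training samples for this episode
        idx = (rng.choice(len(X), batch_size, replace=False)
               if batch_size else np.arange(len(X)))
        Xb, yb = X[idx], y[idx]
        chosen, prev = [], 0.0
        for t in range(n_select):
            avail = [f for f in range(n_features) if f not in chosen]
            if rng.random() < eps:                  # explore
                f = rng.choice(avail)
            else:                                   # exploit
                f = max(avail, key=lambda a: Q[t, a])
            chosen.append(f)
            score = subset_fisher_score(Xb, yb, chosen)
            reward = score - prev                   # internal reward: score gain
            prev = score
            nxt = Q[t + 1].max() if t + 1 < n_select else 0.0
            Q[t, f] += alpha * (reward + gamma * nxt - Q[t, f])
    # read out a greedy feature subset from the learned Q-table
    selected = []
    for t in range(n_select):
        avail = [f for f in range(n_features) if f not in selected]
        selected.append(max(avail, key=lambda a: Q[t, a]))
    return selected
```

On synthetic data with one informative feature, the greedy readout reliably includes it; the `batch_size` argument is what lets each episode touch only a fraction of a large data set.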
Pages: 306-318
Number of pages: 13