Q-Learning with Fisher Score for Feature Selection of Large-Scale Data Sets

Cited by: 3
Authors
Gan, Min [1 ,2 ]
Zhang, Li [1 ,2 ,3 ]
Affiliations
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou, Jiangsu, Peoples R China
[2] Soochow Univ, Joint Int Res Lab Machine Learning & Neuromorph C, Suzhou, Jiangsu, Peoples R China
[3] Soochow Univ, Prov Key Lab Comp Informat Proc Technol, Suzhou, Jiangsu, Peoples R China
Source
KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2021, PT II | 2021, Vol. 12816
Keywords
Feature selection; Fisher score; Q-learning; Large-scale
DOI
10.1007/978-3-030-82147-0_25
CLC number
TP18 [Theory of Artificial Intelligence]
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature selection chooses useful features from the candidate set and is one of the main approaches to data dimensionality reduction. Because general feature selection methods operate directly on the entire given data set, they are time-consuming on large-scale data sets. To address this issue, this paper proposes a novel feature selection method, Q-learning with Fisher score (QLFS), for large-scale data sets. QLFS adopts the Q-learning framework of reinforcement learning (RL) and takes the Fisher score (FS), a filter method for feature selection, as its internal reward. Here, FS is modified to compute the ratio of the between-class to the within-class distance for a feature subset rather than for a single feature. By selecting a portion of the training samples in each episode, QLFS can perform batch learning and thus process large-scale data sets in batches. Experimental results on several large-scale UCI data sets show that QLFS not only improves classification performance but also trains faster than the compared methods.
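The abstract's two ingredients — a Fisher score evaluated on a feature *subset* rather than a single feature, and a Q-learning loop that uses its increments as the internal reward while sampling a mini-batch of training data per episode — can be sketched as follows. This is a minimal illustration under stated assumptions: the trace-based form of the subset score, the state abstraction (state = number of features chosen so far), and all function and hyper-parameter names (`subset_fisher_score`, `qlfs_sketch`, `alpha`, `gamma`, `eps`, `batch_size`) are hypothetical, not the paper's exact formulation.

```python
import numpy as np

def subset_fisher_score(X, y, subset):
    """Ratio of between-class to within-class distance for a feature
    subset (trace-based form; an assumption, not the paper's formula)."""
    Xs = X[:, list(subset)]
    mu = Xs.mean(axis=0)
    sb, sw = 0.0, 0.0
    for c in np.unique(y):
        Xc = Xs[y == c]
        mu_c = Xc.mean(axis=0)
        sb += len(Xc) * np.sum((mu_c - mu) ** 2)  # between-class distance
        sw += np.sum((Xc - mu_c) ** 2)            # within-class distance
    return sb / (sw + 1e-12)

def qlfs_sketch(X, y, n_select, episodes=50, alpha=0.1, gamma=0.9,
                eps=0.2, batch_size=None, rng=None):
    """Epsilon-greedy Q-learning over feature choices, rewarded by the
    increase in the subset Fisher score; each episode may use only a
    mini-batch of the training samples (the paper's batch learning idea)."""
    rng = np.random.default_rng(rng)
    n_features = X.shape[1]
    Q = np.zeros((n_select, n_features))  # Q[t, f]: value of picking f at step t
    for _ in range(episodes):
        # draw a mini-batch of training samples for this episode
        idx = (rng.choice(len(X), batch_size, replace=False)
               if batch_size else np.arange(len(X)))
        Xb, yb = X[idx], y[idx]
        chosen, prev = [], 0.0
        for t in range(n_select):
            avail = [f for f in range(n_features) if f not in chosen]
            if rng.random() < eps:                  # explore
                f = rng.choice(avail)
            else:                                   # exploit
                f = max(avail, key=lambda a: Q[t, a])
            chosen.append(f)
            score = subset_fisher_score(Xb, yb, chosen)
            reward = score - prev                   # internal reward: score gain
            prev = score
            nxt = Q[t + 1].max() if t + 1 < n_select else 0.0
            Q[t, f] += alpha * (reward + gamma * nxt - Q[t, f])
    # read out a greedy feature subset from the learned Q-table
    selected = []
    for t in range(n_select):
        avail = [f for f in range(n_features) if f not in selected]
        selected.append(max(avail, key=lambda a: Q[t, a]))
    return selected
```

On synthetic data with one informative feature, the greedy readout reliably includes it; the `batch_size` argument is what lets each episode touch only a fraction of a large data set.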
Pages: 306-318
Number of pages: 13