Iterative Subset Selection for Feature Drifting Data Streams

Cited by: 9
Authors
Yuan, Lanqin [1]
Pfahringer, Bernhard [2]
Barddal, Jean Paul [3]
Affiliations
[1] Univ Waikato, Hamilton, New Zealand
[2] Univ Auckland, Department Comp Sci, Auckland, New Zealand
[3] Pontificia Univ Catolica Parana, Programa Posgrad Informat, Curitiba, Parana, Brazil
Source
33RD ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING | 2018
Keywords
Data Stream Mining; Feature Selection; Concept Drift; Embedded Feature Selection; Iterative Subset Selection;
DOI
10.1145/3167132.3167188
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Feature selection has been studied and shown to improve classifier performance in standard batch data mining, but it is mostly unexplored in data stream mining. Feature selection becomes even more important when the relevant subset of features changes over time as the underlying concept of a data stream drifts. This specific kind of drift is known as feature drift, and it requires specific techniques not only to determine which features are the most important but also to take advantage of them. This paper presents Iterative Subset Selection (ISS), a novel feature subset selection method specialized for feature drift, which splits the feature selection process into two stages: it first ranks the features and then iteratively selects features from the ranking. Applying our feature selection method together with Naive Bayes or k-Nearest Neighbour as the classifier results in compelling accuracy improvements compared to prior work.
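The two-stage idea described in the abstract (rank first, then select iteratively from the ranking) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the ranking score (absolute difference of class means), the leave-one-out 1-NN evaluation on a single window, the greedy acceptance rule, and all function names are assumptions chosen for illustration. It also assumes binary labels 0/1 with both classes present in the window.

```python
from typing import List, Sequence

Row = Sequence[float]


def rank_features(X: List[Row], y: List[int]) -> List[int]:
    """Stage 1 (assumed score): rank features by |mean(class 1) - mean(class 0)|."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
    # Highest-scoring feature first.
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def loo_1nn_accuracy(X: List[Row], y: List[int], subset: List[int]) -> float:
    """Leave-one-out accuracy of a 1-NN classifier that only sees `subset`."""
    correct = 0
    for i, row in enumerate(X):
        best_dist, best_label = float("inf"), None
        for k, other in enumerate(X):
            if k == i:
                continue  # hold out the instance being classified
            dist = sum((row[j] - other[j]) ** 2 for j in subset)
            if dist < best_dist:
                best_dist, best_label = dist, y[k]
        correct += int(best_label == y[i])
    return correct / len(X)


def iterative_subset_selection(X: List[Row], y: List[int]) -> List[int]:
    """Stage 2 (assumed rule): walk the ranking, keep a feature only if it
    strictly improves accuracy on the current window."""
    selected: List[int] = []
    best_acc = 0.0
    for j in rank_features(X, y):
        acc = loo_1nn_accuracy(X, y, selected + [j])
        if acc > best_acc:
            selected.append(j)
            best_acc = acc
    return selected


# Hypothetical window: feature 0 separates the classes, feature 1 is noise.
window_X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0],
            [1.0, 5.1], [1.1, 1.1], [0.9, 9.1]]
window_y = [0, 0, 0, 1, 1, 1]
print(iterative_subset_selection(window_X, window_y))  # prints [0]
```

In a streaming setting, a selection step like this would be re-run on a sliding window of recent instances, so that the chosen subset can change when a feature drift occurs. How often to re-rank and how large the window should be are design choices this sketch leaves open.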
Pages: 510-517
Number of pages: 8