Iterative Subset Selection for Feature Drifting Data Streams

Cited by: 8
Authors
Yuan, Lanqin [1]
Pfahringer, Bernhard [2]
Barddal, Jean Paul [3]
Affiliations
[1] Univ Waikato, Hamilton, New Zealand
[2] Univ Auckland, Department Comp Sci, Auckland, New Zealand
[3] Pontificia Univ Catolica Parana, Programa Posgrad Informat, Curitiba, Parana, Brazil
Source
33RD ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING | 2018
Keywords
Data Stream Mining; Feature Selection; Concept Drift; Embedded Feature Selection; Iterative Subset Selection;
DOI
10.1145/3167132.3167188
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Feature selection has been studied and shown to improve classifier performance in standard batch data mining, but it is mostly unexplored in data stream mining. Feature selection becomes even more important when the relevant subset of features changes over time, as the underlying concept of a data stream drifts. This specific kind of drift is known as feature drift, and it requires specific techniques not only to determine which features are the most important but also to take advantage of them. This paper presents Iterative Subset Selection (ISS), a novel feature subset selection method specialized for feature drift, which splits the feature selection process into two stages: first ranking the features, and then iteratively selecting features from the ranking. Applying our feature selection method together with Naive Bayes or k-Nearest Neighbour as a classifier results in compelling accuracy improvements compared to prior work.
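The two-stage idea in the abstract (rank the features, then iteratively select from the ranking) can be illustrated with a minimal batch-style sketch. This is not the authors' streaming algorithm: the relevance score (absolute difference of class means), the nearest-centroid evaluator, the validation window, and all function names are assumptions made purely for illustration.

```python
from statistics import mean


def rank_features(X, y):
    """Stage 1: order feature indices by an assumed relevance score,
    here the absolute difference between the two class means."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(mean(pos) - mean(neg)))
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def centroid_accuracy(X_train, y_train, X_val, y_val, subset):
    """Accuracy of a nearest-centroid classifier that only sees `subset`."""
    centroids = {}
    for label in (0, 1):
        rows = [row for row, l in zip(X_train, y_train) if l == label]
        centroids[label] = [mean(r[j] for r in rows) for j in subset]
    correct = 0
    for row, label in zip(X_val, y_val):
        dists = {
            c: sum((row[j] - centre[i]) ** 2 for i, j in enumerate(subset))
            for c, centre in centroids.items()
        }
        correct += min(dists, key=dists.get) == label
    return correct / len(y_val)


def iterative_subset_selection(X_train, y_train, X_val, y_val):
    """Stage 2: grow the subset one ranked feature at a time and keep
    the smallest prefix of the ranking with the best validation accuracy."""
    ranking = rank_features(X_train, y_train)
    best_subset, best_acc = [], -1.0
    for k in range(1, len(ranking) + 1):
        subset = ranking[:k]
        acc = centroid_accuracy(X_train, y_train, X_val, y_val, subset)
        if acc > best_acc:  # strict '>' prefers smaller subsets on ties
            best_subset, best_acc = subset, acc
    return ranking, best_subset, best_acc


# Toy window: feature 0 separates the classes, features 1 and 2 are noise.
X_train = [
    [1.0, 0.5, 0.2], [0.9, 0.1, 0.8], [1.1, 0.9, 0.5], [1.0, 0.3, 0.4],
    [0.0, 0.4, 0.3], [0.1, 0.2, 0.7], [-0.1, 0.8, 0.6], [0.0, 0.6, 0.7],
]
y_train = [1, 1, 1, 1, 0, 0, 0, 0]
X_val = [[0.95, 0.9, 0.1], [0.05, 0.1, 0.9], [1.05, 0.2, 0.9], [-0.05, 0.7, 0.2]]
y_val = [1, 0, 1, 0]

ranking, best_subset, best_acc = iterative_subset_selection(
    X_train, y_train, X_val, y_val
)
print(ranking, best_subset, best_acc)
```

In a streaming setting, one plausible adaptation is to rerun both stages on each new window of instances so that the selected subset can follow a feature drift; the abstract does not specify how ISS does this.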
Pages: 510 - 517
Page count: 8
Related papers
50 in total
  • [21] Tensor decision trees for continual learning from drifting data streams
    Bartosz Krawczyk
    Machine Learning, 2021, 110 : 3015 - 3035
  • [22] Learning from concept drifting data streams with unlabeled data
    Wu, Xindong
    Li, Peipei
    Hu, Xuegang
    NEUROCOMPUTING, 2012, 92 : 145 - 155
  • [23] Towards an optimal feature subset selection
    Shiba, OA
    Saeed, W
    Sulaiman, MN
    Ahmad, F
    Mamat, A
    SCORED 2003: STUDENT CONFERENCE ON RESEARCH AND DEVELOPMENT, PROCEEDINGS: NETWORKING THE FUTURE MIND IN CONVERGENCE TECHNOLOGY, 2003, : 376 - 380
  • [24] Feature subset selection in an ICA space
    Bressan, M
    Vitrià, J
    TOPICS IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2002, 2504 : 196 - 206
  • [25] A new approach to feature subset selection
    Liu, DZ
    Feng, ZJ
    Wang, XZ
    PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2004, : 1822 - 1825
  • [26] Structural XML Classification in Concept Drifting Data Streams
    Dariusz Brzezinski
    Maciej Piernik
    New Generation Computing, 2015, 33 : 345 - 366
  • [27] Structural XML Classification in Concept Drifting Data Streams
    Brzezinski, Dariusz
    Piernik, Maciej
    NEW GENERATION COMPUTING, 2015, 33 (04) : 345 - 366
  • [28] Iterative sparsity score for feature selection and its extension for multimodal data
    Zu, Chen
    Zhu, Linling
    Zhang, Daoqiang
    NEUROCOMPUTING, 2017, 259 : 146 - 153
  • [29] Feature subset selection Filter-Wrapper based on low quality data
    Cadenas, Jose M.
    Carmen Garrido, M.
    Martinez, Raquel
    EXPERT SYSTEMS WITH APPLICATIONS, 2013, 40 (16) : 6241 - 6252
  • [30] Intensive Class Imbalance Learning in Drifting Data Streams
    Usman, Muhammad
    Chen, Huanhuan
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024, 8 (05) : 3503 - 3517