Tensor Decision Trees for Continual Learning from Drifting Data Streams

Cited by: 0
Authors
Krawczyk, Bartosz [1]
Affiliations
[1] Virginia Commonwealth Univ, Dept Comp Sci, Richmond, VA 23284 USA
Source
2021 IEEE 8TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA) | 2021
Keywords
data stream mining; continual learning; concept drift; online learning; decision trees
DOI
10.1109/DSAA53316.2021.9564150
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Data stream classification is one of the most vital areas of contemporary machine learning, as many real-life problems generate data continuously and in large volumes. However, most research in this area focuses on vector-based representations, which are unsuitable for capturing the properties of more complex multi-dimensional structures, such as images and video sequences. In this paper, we propose a novel methodology for learning adaptive decision trees from data streams of tensors. We introduce the Chordal Kernel Decision Tree for continual learning from tensor data streams. To preserve the tensor characteristics, we propose to train and update classifiers in a kernel space designed to work with the tensor representation. We use the chordal distance to compute similarities between tensors and then use these similarities as a new feature space in which decision trees are trained. This allows for direct decision tree induction on tensors. To accommodate the streaming and drifting nature of the data, we propose a concept drift detection scheme based on the tensor representation. It allows us to reconstruct the kernel feature space whenever a change is detected. The proposed approach allows for fast and efficient induction of decision trees on streaming data with tensor representation. An experimental study, conducted on 4 real-world and 52 artificial large-scale tensor data streams, shows that using the native tensor feature space leads to more accurate classification than using vectorized representations.
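The abstract does not give the exact formulation, so the following is only a minimal sketch of how a chordal-distance feature space for tensors might be built. It assumes that each tensor is described by the dominant left singular subspaces of its mode-n unfoldings and that the distance compares the corresponding projection matrices; the function names, the `rank` parameter, and the choice of reference tensors are all hypothetical and not taken from the paper.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: rows index the chosen mode, columns the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_basis(tensor, mode, rank):
    """Orthonormal basis of the dominant subspace of one unfolding (assumed choice)."""
    u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
    return u[:, :rank]


def chordal_distance(a, b, rank=2):
    """Chordal-style distance: compare subspace projectors mode by mode."""
    total = 0.0
    for mode in range(a.ndim):
        ua = mode_basis(a, mode, rank)
        ub = mode_basis(b, mode, rank)
        # Projectors are invariant to the sign/rotation ambiguity of the basis.
        diff = ua @ ua.T - ub @ ub.T
        total += np.linalg.norm(diff, "fro") ** 2
    return float(np.sqrt(total))


def chordal_features(tensors, references, rank=2):
    """Map each tensor to its vector of distances to the reference tensors."""
    return np.array(
        [[chordal_distance(t, r, rank) for r in references] for t in tensors]
    )


# Toy data: six random 4x5x3 tensors, the first three used as references.
rng = np.random.default_rng(0)
stream = [rng.normal(size=(4, 5, 3)) for _ in range(6)]
references = stream[:3]
features = chordal_features(stream, references)
```

Each row of `features` is an ordinary numeric vector, so any standard decision tree learner could in principle be trained on it. Under this sketch, rebuilding the feature space after a detected drift would amount to choosing new reference tensors and recomputing the rows.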
Pages: 2