Probabilistic exact adaptive random forest for recurrent concepts in data streams

Cited by: 5
Authors
Wu, Ocean [1]
Koh, Yun Sing [1]
Dobbie, Gillian [1]
Lacombe, Thomas [1]
Affiliations
[1] Univ Auckland, Sch Comp Sci, Auckland, New Zealand
Keywords
Random forest; Recurring concepts; Concept drift; Data stream
DOI
10.1007/s41060-021-00273-1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In order to adapt random forests to the dynamic nature of data streams, the state-of-the-art technique discards trained trees and grows new trees when concept drifts are detected. This is particularly wasteful when recurrent patterns exist. In this work, we introduce a novel framework called PEARL, which uses both an exact technique and a probabilistic graphical model with Lossy Counting, to replace drifted trees with relevant trees built in the past. The exact technique utilizes pattern matching to find the set of drifted trees that co-occurred in predictions in the past. Meanwhile, a probabilistic graphical model is being built to capture the tree replacements among recurrent concept drifts. Once the graphical model becomes stable, it replaces the exact technique and finds relevant trees in a probabilistic fashion. Further, Lossy Counting is applied to the graphical model which brings an added theoretical guarantee for both error rate and space complexity. We empirically show our technique outperforms baselines in terms of accuracy and kappa on both synthetic and real-world datasets.
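The abstract credits Lossy Counting with the error and space guarantees. As a rough illustration only, here is a minimal sketch of the generic Lossy Counting algorithm for approximate frequency counts over a stream. It is not the authors' implementation, and the abstract does not say how PEARL attaches the counts to its graphical model. The class name `LossyCounter`, its methods, and the `epsilon` parameter are hypothetical names chosen for this sketch.

```python
import math


class LossyCounter:
    """Generic Lossy Counting sketch (hypothetical names, not PEARL's code)."""

    def __init__(self, epsilon):
        if not 0 < epsilon < 1:
            raise ValueError("epsilon must be in (0, 1)")
        self.epsilon = epsilon
        # Bucket width: the stream is processed in buckets of ceil(1/epsilon) items.
        self.width = math.ceil(1 / epsilon)
        self.n = 0          # number of items seen so far
        self.entries = {}   # item -> [count, max_undercount]

    def add(self, item):
        self.n += 1
        bucket = math.ceil(self.n / self.width)  # id of the current bucket
        if item in self.entries:
            self.entries[item][0] += 1
        else:
            # A new item may have been seen and pruned in earlier buckets,
            # so its count could be too low by at most bucket - 1.
            self.entries[item] = [1, bucket - 1]
        # At each bucket boundary, drop items that cannot be frequent.
        if self.n % self.width == 0:
            self.entries = {
                key: value
                for key, value in self.entries.items()
                if value[0] + value[1] > bucket
            }

    def estimate(self, item):
        # Stored count; it undercounts the true count by at most epsilon * n.
        return self.entries.get(item, (0, 0))[0]

    def frequent(self, support):
        # Items whose stored count reaches (support - epsilon) * n.
        threshold = (support - self.epsilon) * self.n
        return {key for key, (count, _) in self.entries.items() if count >= threshold}
```

In this sketch a stored count never exceeds the true count and falls short of it by at most `epsilon * n`, while infrequent items are pruned at every bucket boundary. That pruning is what keeps memory bounded, and it is the kind of error and space trade-off the abstract refers to.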
Pages: 17-32
Number of pages: 16
Related papers
  • [1] Probabilistic exact adaptive random forest for recurrent concepts in data streams
    Ocean Wu
    Yun Sing Koh
    Gillian Dobbie
    Thomas Lacombe
    International Journal of Data Science and Analytics, 2022, 13 : 17 - 32
  • [2] PEARL: Probabilistic Exact Adaptive Random Forest with Lossy Counting for Data Streams
    Wu, Ocean
    Koh, Yun Sing
    Dobbie, Gillian
    Lacombe, Thomas
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2020, PT II, 2020, 12085 : 17 - 30
  • [3] Measuring the Effectiveness of Adaptive Random Forest for Handling Concept Drift in Big Data Streams
    AlQabbany, Abdulaziz O.
    Azmi, Aqil M.
    ENTROPY, 2021, 23 (07)
  • [4] Recurrent concepts in data streams classification
    João Gama
    Petr Kosina
    Knowledge and Information Systems, 2014, 40 : 489 - 507
  • [5] Recurrent concepts in data streams classification
    Gama, Joao
    Kosina, Petr
    KNOWLEDGE AND INFORMATION SYSTEMS, 2014, 40 (03) : 489 - 507
  • [6] A Probabilistic Framework for Adapting to Changing and Recurring Concepts in Data Streams
    Halstead, Ben
    Koh, Yun Sing
    Riddle, Patricia
    Pechenizkiy, Mykola
    Bifet, Albert
    2022 IEEE 9TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA), 2022, : 407 - 416
  • [7] Unsupervised Context Switch for Classification Tasks on Data Streams with Recurrent Concepts
    dos Reis, Denis M.
    Maletzke, Andre G.
    Batista, Gustavo E. A. P. A.
    33RD ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, 2018, : 518 - 524
  • [8] Improving the Efficiency of Ensemble Classifier Adaptive Random Forest with Meta Level Learning for Real-Time Data Streams
    Arya, Monika
    Choudhary, Chaitali
    INTELLIGENT COMPUTING AND COMMUNICATION, ICICC 2019, 2020, 1034 : 11 - 21
  • [9] Fingerprinting Concepts in Data Streams with Supervised and Unsupervised Meta-Information
    Halstead, Ben
    Koh, Yun Sing
    Riddle, Patricia
    Pechenizkiy, Mykola
    Bifet, Albert
    Pears, Russel
    2021 IEEE 37TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2021), 2021, : 1056 - 1067
  • [10] On Robustness of Adaptive Random Forest Classifier on Biomedical Data Stream
    Fatlawi, Hayder K.
    Kiss, Attila
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS (ACIIDS 2020), PT I, 2020, 12033 : 332 - 344