Probabilistic exact adaptive random forest for recurrent concepts in data streams

Cited by: 5
Authors
Wu, Ocean [1]
Koh, Yun Sing [1]
Dobbie, Gillian [1]
Lacombe, Thomas [1]
Affiliations
[1] Univ Auckland, Sch Comp Sci, Auckland, New Zealand
Keywords
Random forest; Recurring concepts; Concept drift; Data stream; CONCEPT DRIFTS;
DOI
10.1007/s41060-021-00273-1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In order to adapt random forests to the dynamic nature of data streams, the state-of-the-art technique discards trained trees and grows new trees when concept drifts are detected. This is particularly wasteful when recurrent patterns exist. In this work, we introduce a novel framework called PEARL, which uses both an exact technique and a probabilistic graphical model with Lossy Counting, to replace drifted trees with relevant trees built in the past. The exact technique utilizes pattern matching to find the set of drifted trees that co-occurred in predictions in the past. Meanwhile, a probabilistic graphical model is being built to capture the tree replacements among recurrent concept drifts. Once the graphical model becomes stable, it replaces the exact technique and finds relevant trees in a probabilistic fashion. Further, Lossy Counting is applied to the graphical model which brings an added theoretical guarantee for both error rate and space complexity. We empirically show our technique outperforms baselines in terms of accuracy and kappa on both synthetic and real-world datasets.
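The abstract credits Lossy Counting with bounding both the error rate and the space used by the graphical model. As a rough illustration of that general technique (not PEARL's actual implementation, which the abstract does not detail), the following Python sketch counts item frequencies over a stream with an error parameter `epsilon`. The class name `LossyCounter`, its methods, and the example stream are hypothetical; in PEARL's setting the counted items would presumably be tree-replacement transitions, but that mapping is an assumption here.

```python
import math


class LossyCounter:
    """Generic Lossy Counting sketch (hypothetical helper, not PEARL's code)."""

    def __init__(self, epsilon):
        if not 0 < epsilon < 1:
            raise ValueError("epsilon must be in (0, 1)")
        self.epsilon = epsilon
        self.width = math.ceil(1 / epsilon)  # items per bucket
        self.n = 0                           # items seen so far
        self.entries = {}                    # item -> [count, max possible undercount]

    def add(self, item):
        self.n += 1
        bucket = math.ceil(self.n / self.width)  # id of the current bucket
        if item in self.entries:
            self.entries[item][0] += 1
        else:
            # A new item may have been seen and pruned in earlier buckets,
            # so its count could be too low by up to bucket - 1.
            self.entries[item] = [1, bucket - 1]
        if self.n % self.width == 0:
            # Bucket boundary: drop items that are provably infrequent.
            self.entries = {
                key: value
                for key, value in self.entries.items()
                if value[0] + value[1] > bucket
            }

    def estimate(self, item):
        """Return the stored count, which never exceeds the true count."""
        return self.entries.get(item, [0, 0])[0]


counter = LossyCounter(epsilon=0.25)
for transition in ["a", "a", "b", "a", "c", "a", "a", "d"]:
    counter.add(transition)
```

With `epsilon = 0.25` the bucket width is 4. A stored count never overestimates the true frequency and undercounts it by at most `epsilon * n`, which is the kind of error guarantee the abstract refers to. Infrequent items are pruned at each bucket boundary, which keeps the table small.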
Pages: 17-32
Number of pages: 16
Related papers
50 in total
  • [21] Fuzzy Clustering-Based Adaptive Regression for Drifting Data Streams
    Song, Yiliao
    Lu, Jie
    Lu, Haiyan
    Zhang, Guangquan
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2020, 28 (03) : 544 - 557
  • [22] Credit Card Fraud Detection with Autoencoder and Probabilistic Random Forest
    Lin, Tzu-Hsuan
    Jiang, Jehn-Ruey
    MATHEMATICS, 2021, 9 (21)
  • [23] Random Ensemble Decision Trees for Learning Concept-Drifting Data Streams
    Li, Peipei
    Wu, Xindong
    Liang, Qianhui
    Hu, Xuegang
    Zhang, Yuhong
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PT I: 15TH PACIFIC-ASIA CONFERENCE, PAKDD 2011, 2011, 6634 : 313 - 325
  • [24] Adaptive Spatial Partitioning for Multidimensional Data Streams
    John Hershberger
    Nisheeth Shrivastava
    Subhash Suri
    Csaba D. Toth
    Algorithmica, 2006, 46 : 97 - 117
  • [25] Adaptive Clustering for Dynamic IoT Data Streams
    Puschmann, Daniel
    Barnaghi, Payam
    Tafazolli, Rahim
    IEEE INTERNET OF THINGS JOURNAL, 2017, 4 (01) : 64 - 74
  • [26] Rapidly Labeling and Tracking Dynamically Evolving Concepts In Data Streams
    Parker, Brandon S.
    Khan, Latifur
    2013 IEEE 13TH INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW), 2013 : 1161 - 1164
  • [27] An online adaptive classifier ensemble for mining non-stationary data streams
    Verdecia-Cabrera, Alberto
    Blanco, Isvani Frias
    Carvalho, Andre C. P. L. F.
    INTELLIGENT DATA ANALYSIS, 2018, 22 (04) : 787 - 806
  • [28] Detection of evolving concepts in non-stationary data streams: A multiple kernel learning approach
    Siahroudi, Sajjad Kamali
    Moodi, Poorya Zare
    Beigy, Hamid
    EXPERT SYSTEMS WITH APPLICATIONS, 2018, 91 : 187 - 197
  • [29] SAE: Social Adaptive Ensemble Classifier for Data Streams
    Gomes, Heitor Murilo
    Enembreck, Fabricio
    2013 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DATA MINING (CIDM), 2013 : 199 - 206
  • [30] MODIFICATION OF RANDOM FOREST BASED APPROACH FOR STREAMING DATA WITH CONCEPT DRIFT
    Zhukov, A. V.
    Sidorov, D. N.
    BULLETIN OF THE SOUTH URAL STATE UNIVERSITY SERIES-MATHEMATICAL MODELLING PROGRAMMING & COMPUTER SOFTWARE, 2016, 9 (04) : 86 - 95