Detecting group concept drift from multiple data streams

Cited by: 37
Authors
Yu, Hang [1]
Liu, Weixu [1]
Lu, Jie [2]
Wen, Yimin [3]
Luo, Xiangfeng [1]
Zhang, Guangquan [2]
Affiliations
[1] Shanghai Univ, Sch Comp Engn & Sci, 333 Nanchen Rd, Shanghai 200444, Peoples R China
[2] Univ Technol Sydney, Fac Engn & Informat Technol, POB 123, Sydney, NSW 2007, Australia
[3] Guilin Univ Elect Technol, Sch Comp Sci & Informat Secur, 1, Jinji Rd, Qixing Dist, Guilin 541004, Guangxi, Peoples R China
Funding
Australian Research Council;
Keywords
Concept drift; Data streams; Online learning; Hypothesis test; ONLINE;
DOI
10.1016/j.patcog.2022.109113
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Concept drift, caused by unforeseeable changes in the underlying distribution of data, may lead to a sharp downturn in the performance of algorithms built on streaming data. In this paper, we are mainly concerned with concept drift across multiple data streams, in situations where the drift of each individual stream cannot be detected in time because its underlying distribution shifts only slightly. We call this group concept drift. Compared with detecting concept drift in a single data stream, detecting group concept drift is challenging in three respects: first, the training data become more complex; second, the underlying distribution becomes more complex; and third, the correlations between data streams become more complex. To address these challenges, the key idea of our method is to construct a distribution-free test statistic, one that does not depend on any underlying distribution in the multiple data streams. Then, for streaming data, we design an online learning algorithm to obtain this test statistic, and concept drift is determined by a hypothesis test. Experimental evaluations on both synthetic and real-world datasets show that our method can accurately detect concept drift from multiple data streams. (c) 2022 Elsevier Ltd. All rights reserved.
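The abstract does not spell out the test statistic or the online algorithm, so the following is only a minimal illustrative sketch of the general idea of a distribution-free hypothesis test pooled over several streams. It is not the authors' method. It assumes a rank-based statistic (the gap between mean ranks of a reference window and a current window), sums that gap over all streams, and obtains a p-value by permutation, in batch rather than online. The names `mean_rank_gap` and `group_drift_p_value`, the window layout, and the parameters `n_perm` and `seed` are hypothetical.

```python
import random


def mean_rank_gap(reference, current):
    """Absolute gap between the mean ranks of two windows.

    Only ranks are used, so the statistic does not depend on the shape
    of the underlying distribution. Ties are broken by position, which
    is a simplification for this sketch.
    """
    pooled = sorted((value, idx) for idx, value in enumerate(reference + current))
    ranks = [0.0] * len(pooled)
    for rank, (_, idx) in enumerate(pooled, start=1):
        ranks[idx] = rank
    n_ref = len(reference)
    ref_mean = sum(ranks[:n_ref]) / n_ref
    cur_mean = sum(ranks[n_ref:]) / len(current)
    return abs(ref_mean - cur_mean)


def group_drift_p_value(windows, n_perm=500, seed=0):
    """Permutation p-value for drift pooled over several streams.

    `windows` is a list of (reference, current) pairs of lists, one pair
    per stream (an assumed layout). The group statistic is the sum of
    the per-stream rank gaps, so small shifts in many streams can add up.
    """
    rng = random.Random(seed)
    observed = sum(mean_rank_gap(ref, cur) for ref, cur in windows)
    hits = 0
    for _ in range(n_perm):
        total = 0.0
        for ref, cur in windows:
            pooled = ref + cur          # new list, safe to shuffle in place
            rng.shuffle(pooled)
            total += mean_rank_gap(pooled[:len(ref)], pooled[len(ref):])
        if total >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Three streams whose current windows are clearly shifted.
    drifted = [(list(range(20)), list(range(20, 40))) for _ in range(3)]
    # Three streams whose windows interleave, i.e. no real shift.
    stable = [(list(range(0, 40, 2)), list(range(1, 40, 2))) for _ in range(3)]
    print("drifted p-value:", group_drift_p_value(drifted))
    print("stable p-value:", group_drift_p_value(stable))
```

A small p-value would lead to rejecting the null hypothesis of no drift at a chosen significance level. Summing across streams is one simple way to let weak per-stream evidence accumulate; it ignores correlations between streams, which the abstract names as one of the challenges.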
Pages: 11