Analyzing and repairing concept drift adaptation in data stream classification

Cited by: 18
Authors
Halstead, Ben [1]
Koh, Yun Sing [1]
Riddle, Patricia [1]
Pears, Russel [2]
Pechenizkiy, Mykola [3]
Bifet, Albert [4, 5]
Olivares, Gustavo [6]
Coulson, Guy [6]
Affiliations
[1] Univ Auckland, Sch Comp Sci, Auckland, New Zealand
[2] Auckland Univ Technol, Auckland, New Zealand
[3] Eindhoven Univ Technol, Eindhoven, Netherlands
[4] Univ Waikato, Hamilton, New Zealand
[5] IP Paris, Telecom Paris, LTCI, Paris, France
[6] Natl Inst Water & Atmospher Res, Auckland, New Zealand
Keywords
Concept drift; Data stream classification; Recurring concepts; CLASSIFIERS; SELECTION;
DOI
10.1007/s10994-021-05993-w
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Data collected over time often exhibit changes in distribution, or concept drift, caused by changes in factors relevant to the classification task, e.g. weather conditions. Incorporating all relevant factors into the model may capture these changes; however, this is usually not practical. Data stream based methods, which instead explicitly detect concept drift, have been shown to retain performance under unknown changing conditions. These methods adapt to concept drift by training a model to classify each distinct data distribution. However, we hypothesize that existing methods do not robustly handle real-world tasks, leading to adaptation errors where context is misidentified. Adaptation errors may cause a system to use a model which does not fit the current data, reducing performance. We propose a novel repair algorithm to identify and correct errors in concept drift adaptation. Evaluation on synthetic data shows that our proposed AiRStream system has higher performance than baseline methods, while also being better at capturing the dynamics of the stream. Evaluation on an air quality inference task shows AiRStream provides increased real-world performance compared to eight baseline methods. A case study shows that AiRStream is able to build a robust model of environmental conditions over this task, allowing the adaptations made to concept drift to be analysed and related to changes in weather. We discovered a strong predictive link between the adaptations made by AiRStream and changes in meteorological conditions.
Pages: 3489 - 3523
Number of pages: 35