Analyzing and repairing concept drift adaptation in data stream classification

Cited by: 18
Authors
Halstead, Ben [1]
Koh, Yun Sing [1]
Riddle, Patricia [1]
Pears, Russel [2]
Pechenizkiy, Mykola [3]
Bifet, Albert [4, 5]
Olivares, Gustavo [6]
Coulson, Guy [6]
Affiliations
[1] Univ Auckland, Sch Comp Sci, Auckland, New Zealand
[2] Auckland Univ Technol, Auckland, New Zealand
[3] Eindhoven Univ Technol, Eindhoven, Netherlands
[4] Univ Waikato, Hamilton, New Zealand
[5] IP Paris, Telecom Paris, LTCI, Paris, France
[6] Natl Inst Water & Atmospher Res, Auckland, New Zealand
Keywords
Concept drift; Data stream classification; Recurring concepts; CLASSIFIERS; SELECTION
DOI
10.1007/s10994-021-05993-w
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data collected over time often exhibit changes in distribution, or concept drift, caused by changes in factors relevant to the classification task, e.g. weather conditions. Incorporating all relevant factors into the model may be able to capture these changes; however, this is usually not practical. Data stream based methods, which instead explicitly detect concept drift, have been shown to retain performance under unknown changing conditions. These methods adapt to concept drift by training a model to classify each distinct data distribution. However, we hypothesize that existing methods do not robustly handle real-world tasks, leading to adaptation errors where context is misidentified. Adaptation errors may cause a system to use a model which does not fit the current data, reducing performance. We propose a novel repair algorithm to identify and correct errors in concept drift adaptation. Evaluation on synthetic data shows that our proposed AiRStream system has higher performance than baseline methods, while also being better at capturing the dynamics of the stream. Evaluation on an air quality inference task shows AiRStream provides increased real-world performance compared to eight baseline methods. A case study shows that AiRStream is able to build a robust model of environmental conditions over this task, allowing the adaptations made to concept drift to be analysed and related to changes in weather. We discovered a strong predictive link between the adaptations made by AiRStream and changes in meteorological conditions.
Pages: 3489 - 3523
Page count: 35
Related papers
50 in total
  • [21] Deterministic Concept Drift Detection in Ensemble Classifier Based Data Stream Classification Process
    Abdualrhman, Mohammed Ahmed Ali
    Padma, M. C.
    INTERNATIONAL JOURNAL OF GRID AND HIGH PERFORMANCE COMPUTING, 2019, 11 (01) : 29 - 48
  • [22] A Novel Weight Adjustment Method for Handling Concept-Drift in Data Stream Classification
    Homeira Shahparast
    Mansoor Zolghadri Jahromi
    Mohammad Taheri
    Sam Hamzeloo
    Arabian Journal for Science and Engineering, 2014, 39 : 799 - 807
  • [23] Heuristic ensemble for unsupervised detection of multiple types of concept drift in data stream classification
    Hu, Hanqing
    Kantardzic, Mehmed
    INTELLIGENT DECISION TECHNOLOGIES-NETHERLANDS, 2021, 15 (04) : 609 - 622
  • [24] G-mean Weighted Classification Method for Imbalanced Data Stream with Concept Drift
    Liang B.
    Li G.
    Dai C.
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2022, 59 (12) : 2844 - 2857
  • [25] Streaming Data Classification with Concept Drift
    Althabiti, Mashail
    Abdullah, Manal
    BIOSCIENCE BIOTECHNOLOGY RESEARCH COMMUNICATIONS, 2019, 12 (01) : 177 - 184
  • [26] Classification of concept drift data streams
    Padmalatha, E.
    Reddy, C. R. K.
    Rani, B. Padmaja
    2014 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND APPLICATIONS (ICISA), 2014
  • [27] Combining active learning with concept drift detection for data stream mining
    Krawczyk, Bartosz
    Pfahringer, Bernhard
    Wozniak, Michal
    2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018 : 2239 - 2244
  • [28] A Survey on Concept Drift Adaptation
    Gama, Joao
    Zliobaite, Indre
    Bifet, Albert
    Pechenizkiy, Mykola
    Bouchachia, Abdelhamid
    ACM COMPUTING SURVEYS, 2014, 46 (04)
  • [29] Detection of Concept Drift for Learning from Stream Data
    Lee, Jeonghoon
    Magoules, Frederic
    2012 IEEE 14TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2012 IEEE 9TH INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS (HPCC-ICESS), 2012 : 241 - 245
  • [30] Detecting concept drift using HEDDM in data stream
    Dongre, Snehlata S.
    Malik, Latesh G.
    Thomas, Achamma
    INTERNATIONAL JOURNAL OF INTELLIGENT ENGINEERING INFORMATICS, 2019, 7 (2-3) : 164 - 179