Analyzing and repairing concept drift adaptation in data stream classification

Cited by: 18
Authors
Halstead, Ben [1]
Koh, Yun Sing [1]
Riddle, Patricia [1]
Pears, Russel [2]
Pechenizkiy, Mykola [3]
Bifet, Albert [4, 5]
Olivares, Gustavo [6]
Coulson, Guy [6]
Affiliations
[1] Univ Auckland, Sch Comp Sci, Auckland, New Zealand
[2] Auckland Univ Technol, Auckland, New Zealand
[3] Eindhoven Univ Technol, Eindhoven, Netherlands
[4] Univ Waikato, Hamilton, New Zealand
[5] IP Paris, Telecom Paris, LTCI, Paris, France
[6] Natl Inst Water & Atmospher Res, Auckland, New Zealand
Keywords
Concept drift; Data stream classification; Recurring concepts; CLASSIFIERS; SELECTION
DOI
10.1007/s10994-021-05993-w
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data collected over time often exhibit changes in distribution, or concept drift, caused by changes in factors relevant to the classification task, e.g. weather conditions. Incorporating all relevant factors into the model may capture these changes; however, this is usually not practical. Data stream based methods, which instead explicitly detect concept drift, have been shown to retain performance under unknown changing conditions. These methods adapt to concept drift by training a model to classify each distinct data distribution. However, we hypothesize that existing methods do not robustly handle real-world tasks, leading to adaptation errors where context is misidentified. Adaptation errors may cause a system to use a model which does not fit the current data, reducing performance. We propose a novel repair algorithm to identify and correct errors in concept drift adaptation. Evaluation on synthetic data shows that our proposed AiRStream system has higher performance than baseline methods, while also being better at capturing the dynamics of the stream. Evaluation on an air quality inference task shows AiRStream provides increased real-world performance compared to eight baseline methods. A case study shows that AiRStream is able to build a robust model of environmental conditions over this task, allowing the adaptations made to concept drift to be analysed and related to changes in weather. We discovered a strong predictive link between the adaptations made by AiRStream and changes in meteorological conditions.
Pages: 3489-3523
Number of pages: 35
Related papers
50 in total
  • [1] Analyzing and repairing concept drift adaptation in data stream classification
    Ben Halstead
    Yun Sing Koh
    Patricia Riddle
    Russel Pears
    Mykola Pechenizkiy
    Albert Bifet
    Gustavo Olivares
    Guy Coulson
    Machine Learning, 2022, 111 : 3489 - 3523
  • [2] Adaptive Classification Algorithm for Concept Drift Data Stream
    Cai H.
    Lu K.
    Wu Q.
    Wu D.
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2022, 59 (03): : 633 - 646
  • [3] Uncertain Data Stream Classification with Concept Drift
    Lv Yanxia
    Wang Cuirong
    Wang Cong
    Liu Bingyu
    2016 FOURTH INTERNATIONAL CONFERENCE ON ADVANCED CLOUD AND BIG DATA (CBD 2016), 2016, : 265 - +
  • [4] Study on a classification model of data stream based on concept drift
    1600, Science and Engineering Research Support Society (09): : 363 - 372
  • [5] Scalable concept drift adaptation for stream data mining
    Hu, Lisha
    Li, Wenxiu
    Lu, Yaru
    Hu, Chunyu
    COMPLEX & INTELLIGENT SYSTEMS, 2024, 10 (05) : 6725 - 6743
  • [6] Classification Method for Data Stream Based on Concept Drift Detection Technique
    Wang Jianhua
    Li Xiaofeng
    Gao Weiwei
    PROCEEDINGS OF 2015 4TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT 2015), 2015, : 637 - 640
  • [7] Online Classification Algorithm for Concept Drift and Class Imbalance Data Stream
    Lu K.-Z.
    Chen C.-F.
    Cai H.
    Wu D.-M.
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2022, 50 (03): : 585 - 597
  • [8] An ensemble method for data stream classification in the presence of concept drift
    Omid Abbaszadeh
    Ali Amiri
    Ali Reza Khanteymoori
    Frontiers of Information Technology & Electronic Engineering, 2015, 16 : 1059 - 1068
  • [9] Feature Selection for Handling Concept Drift in the Data Stream Classification
    Turkov, Pavel
    Krasotkina, Olga
    Mottl, Vadim
    Sychugov, Alexey
    MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION (MLDM 2016), 2016, 9729 : 614 - 629
  • [10] An ensemble method for data stream classification in the presence of concept drift
    Omid ABBASZADEH
    Ali AMIRI
    Ali Reza KHANTEYMOORI
    Frontiers of Information Technology & Electronic Engineering, 2015, 16 (12) : 1059 - 1068