Active Learning with Abstaining Classifiers for Imbalanced Drifting Data Streams

Cited by: 0
Authors
Korycki, Lukasz [1]
Cano, Alberto [1]
Krawczyk, Bartosz [1]
Affiliations
[1] Virginia Commonwealth Univ, Dept Comp Sci, Richmond, VA 23284 USA
Source
2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) | 2019
Keywords
machine learning; data stream mining; imbalanced data; active learning; ensemble learning; RESAMPLING ENSEMBLE;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Learning from data streams is one of the most promising and challenging domains in modern machine learning. Proliferating online data sources give us access to real-time knowledge we have never had before. At the same time, new obstacles emerge that must be overcome in order to fully and effectively utilize the potential of the data. Prohibitive time and memory constraints and non-stationary distributions are only some of the problems. In classification tasks, effective adaptation has to be achieved on the weak foundation of partially labeled and often imbalanced data. In our work, we propose an online framework for binary classification that aims to handle the complex problem of working with dynamic, sparsely labeled, and imbalanced streams. Its main component is a novel active learning strategy (MD-OAL) that prioritizes labeling of minority instances and, as a result, improves the balance of the learning process. We combine the strategy with a dynamic ensemble of base learners that can abstain from making decisions when they are very uncertain. We adjust the abstaining mechanism in favor of minority instances, providing an effective method for handling the remaining imbalance and concept drift simultaneously. The conducted evaluation shows that in challenging and realistic scenarios our framework outperforms state-of-the-art algorithms, providing higher resilience to the combined effect of limited labeling and imbalance.
Pages: 2334 - 2343
Page count: 10
Related papers
50 in total
  • [41] Active learning for imbalanced data under cold start
    Barata, Ricardo
    Leite, Miguel
    Pacheco, Ricardo
    Sampaio, Marco O. P.
    Ascensao, Joao Tiago
    Bizarro, Pedro
    ICAIF 2021: THE SECOND ACM INTERNATIONAL CONFERENCE ON AI IN FINANCE, 2021
  • [42] Blending Query Strategy of Active Learning for Imbalanced Data
    Kim, Gwangsu
    Yoo, Chang D.
    IEEE ACCESS, 2022, 10 : 79526 - 79542
  • [43] Sample Selection based Active Learning for Imbalanced Data
    Chairi, Ikram
    Alaoui, Souad
    Lyhyaoui, Abdelouahid
    10TH INTERNATIONAL CONFERENCE ON SIGNAL-IMAGE TECHNOLOGY AND INTERNET-BASED SYSTEMS SITIS 2014, 2014 : 645 - 651
  • [44] Online cost-sensitive neural network classifiers for non-stationary and imbalanced data streams
    Adel Ghazikhani
    Reza Monsefi
    Hadi Sadoghi Yazdi
    Neural Computing and Applications, 2013, 23 : 1283 - 1295
  • [45] Incremental Learning and Forgetting in One-Class Classifiers for Data Streams
    Krawczyk, Bartosz
    Wozniak, Michal
    PROCEEDINGS OF THE 8TH INTERNATIONAL CONFERENCE ON COMPUTER RECOGNITION SYSTEMS CORES 2013, 2013, 226 : 319 - 328
  • [46] Rarity updated ensemble with oversampling: An ensemble approach to classification of imbalanced data streams
    Nouri, Zahra
    Kiani, Vahid
    Fadishei, Hamid
    STATISTICAL ANALYSIS AND DATA MINING, 2024, 17 (01)
  • [47] Dynamic budget allocation for sparsely labeled drifting data streams
    Aguiar, Gabriel J.
    Cano, Alberto
    INFORMATION SCIENCES, 2024, 654
  • [48] Online active learning method for multi-class imbalanced data stream
    Ang Li
    Meng Han
    Dongliang Mu
    Zhihui Gao
    Shujuan Liu
    Knowledge and Information Systems, 2024, 66 : 2355 - 2391
  • [49] Collective of Base Classifiers for Mining Imbalanced Data
    Jedrzejowicz, Joanna
    Jedrzejowicz, Piotr
    COMPUTATIONAL SCIENCE, ICCS 2022, PT II, 2022 : 571 - 585
  • [50] Balanced Neighborhood Classifiers for Imbalanced Data Sets
    Zhu, Shunzhi
    Ma, Ying
    Pan, Weiwei
    Zhu, Xiatian
    Luo, Guangchun
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2014, E97D (12) : 3226 - 3229