EMRIL: Ensemble Method based on ReInforcement Learning for binary classification in imbalanced drifting data streams

Cited by: 1
Authors
Usman, Muhammad [1 ]
Chen, Huanhuan [1 ]
Affiliations
[1] Univ Sci & Technol China, 96 JinZhai Rd, Hefei 230026, Anhui, Peoples R China
Keywords
Data stream classification; Imbalance; Concept drift; Ensemble learning; Reinforcement Learning;
DOI
10.1016/j.neucom.2024.128259
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The co-occurrence of evolving concepts and imbalanced data deteriorates the learning performance of classifiers in data streams. Recent studies do not account for the data difficulty factors associated with class imbalance, i.e. imbalance complexity, which complicates imbalanced learning in a drifting data environment. This paper proposes EMRIL, a novel batch-based ensemble method, to deal with this challenge. As part of EMRIL, the Imbalance Complexity Redressing Component (EMRIL ICRC), a data-level balancing module, resolves the imbalance complexity to increase minority-class visibility for the base classifiers of the ensemble. Additionally, a novel ensemble pool management (EMRIL EPM) technique is designed using Reinforcement Learning (RL). EMRIL EPM regularly updates the ensemble pool and constructs an optimal base-classifier subset for predictions through effective training and evaluation policies. Handling imbalance complexity together with RL-based ensemble pool management helps EMRIL perform the binary classification task effectively in imbalanced and evolving data streams. A comprehensive experimental evaluation is conducted on 104 data streams containing a variety of concept drifts and imbalance ratios, categorized by various data difficulty factors. The results are compared with 15 state-of-the-art methods, showing the superiority of the proposed method.
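The abstract's two core ideas, data-level balancing of each incoming batch and RL-driven selection of a base-classifier subset from the ensemble pool, can be illustrated with a small sketch. The Python code below is a hedged approximation rather than the authors' algorithm: random minority oversampling stands in for EMRIL ICRC, and a simple epsilon-greedy bandit over pool members stands in for EMRIL EPM. All function names, parameters, and the toy drifting stream are assumptions made purely for illustration.

# Illustrative sketch only: (1) balance each batch before training a new base
# classifier (a stand-in for EMRIL ICRC), (2) pick a subset of the ensemble pool
# by estimated reward with epsilon-greedy exploration (a stand-in for EMRIL EPM).
# Names, parameters, and the toy stream are assumptions, not the paper's method.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def balance_batch(X, y):
    """Randomly oversample the minority class so both classes are equally represented."""
    classes, counts = np.unique(y, return_counts=True)
    if len(classes) < 2:
        return X, y
    minority = classes[np.argmin(counts)]
    idx_min = np.where(y == minority)[0]
    extra = rng.choice(idx_min, size=counts.max() - counts.min(), replace=True)
    keep = np.concatenate([np.arange(len(y)), extra])
    return X[keep], y[keep]

def gmean(y_true, y_pred):
    """Geometric mean of per-class recalls, a common imbalance-aware metric."""
    recalls = [(y_pred[y_true == c] == c).mean() for c in np.unique(y_true)]
    return float(np.prod(recalls) ** (1.0 / len(recalls)))

class EpsilonGreedyPool:
    """Pool of classifiers; a subset is selected by estimated reward (smoothed G-mean)."""
    def __init__(self, pool_size=10, subset_size=3, epsilon=0.1):
        self.pool, self.reward = [], []
        self.pool_size, self.subset_size, self.epsilon = pool_size, subset_size, epsilon

    def add(self, clf):
        if len(self.pool) >= self.pool_size:          # drop the weakest member
            worst = int(np.argmin(self.reward))
            self.pool.pop(worst); self.reward.pop(worst)
        self.pool.append(clf); self.reward.append(0.0)

    def select(self):
        k = min(self.subset_size, len(self.pool))
        if rng.random() < self.epsilon:               # explore: random subset
            return list(rng.choice(len(self.pool), size=k, replace=False))
        return list(np.argsort(self.reward)[-k:])     # exploit: best-rewarded members

    def update(self, X, y):
        for i, clf in enumerate(self.pool):           # exponentially smoothed batch G-mean
            self.reward[i] = 0.9 * self.reward[i] + 0.1 * gmean(y, clf.predict(X))

def make_batch(t, n=200, imbalance=0.05):
    """Toy imbalanced stream with an abrupt concept drift halfway through."""
    y = (rng.random(n) < imbalance).astype(int)
    shift = 0.0 if t < 25 else 2.0
    X = rng.normal(size=(n, 2)) + y[:, None] * (1.5 + shift)
    return X, y

pool = EpsilonGreedyPool()
for t in range(50):
    X, y = make_batch(t)
    if pool.pool:                                     # test-then-train: predict first
        subset = pool.select()
        votes = np.mean([pool.pool[i].predict(X) for i in subset], axis=0)
        print(f"batch {t:02d}  G-mean {gmean(y, (votes >= 0.5).astype(int)):.2f}")
        pool.update(X, y)
    Xb, yb = balance_batch(X, y)                      # balance, then train a new member
    pool.add(DecisionTreeClassifier(max_depth=3).fit(Xb, yb))

Running the sketch prints a per-batch G-mean that typically dips at the drift point and recovers as newer, rebalanced members gain reward, which is the qualitative behaviour the abstract attributes to combining data-level balancing with reward-driven pool management.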
Pages: 22