Dynamic weighted selective ensemble learning algorithm for imbalanced data streams

Cited by: 0
Authors
Zhang Yan
Du Hongle
Ke Gang
Zhang Lin
Yeh-Cheng Chen
Affiliations
[1] Shangluo University, School of Mathematics and Computer Application
[2] Shangluo Public Big Data Research Center, Department of Computer Engineering
[3] Dongguan Polytechnic, Department of Computer Science
[4] University of California
Source
The Journal of Supercomputing | 2022 / Vol. 78
Keywords
Concept drift; Imbalanced data stream; Data stream mining; Ensemble learning
DOI
Not available
Abstract
Data stream mining is an active research topic in data mining. Most existing algorithms assume that a data stream with concept drift is class-balanced. In real-world applications, however, data streams with concept drift are imbalanced, which makes the learning problem more complex. In online learning, oversampling selects a small number of samples from the previous data block according to some strategy and adds them to the current data block to enlarge its minority class. With this approach, the number of stored samples, the oversampling method, and the way base-classifier weights are calculated all affect the classification performance of the ensemble classifier. This paper proposes a dynamic weighted selective ensemble (DWSE) learning algorithm for imbalanced data streams with concept drift. First, minority samples from the previous data block are resampled to enlarge the minority class of the current data block, so that information from the previous block is absorbed when building a classifier and the impact of concept drift is reduced. The paper defines how the information content of each sample is calculated and gives the resampling and updating methods for the minority samples. Second, concept drift degrades the performance of the base classifiers, and a decay factor is usually used to describe this degradation. A static decay factor, however, cannot accurately describe how a base classifier degrades under concept drift. DWSE therefore defines a dynamic decay factor for each base classifier and uses the degree of decay to select which sub-classifiers to eliminate, which lets the algorithm handle concept drift better.
Comparisons with other algorithms show that DWSE achieves better classification performance on both majority-class and minority-class samples.
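The abstract names two mechanisms without giving their formulas: carrying minority samples from earlier blocks into the current block, and decaying base-classifier weights by a per-classifier, time-varying factor before pruning. The Python sketch below illustrates the general shape of both steps only. Everything specific in it is an assumption and not taken from the paper: the function names `augment_block` and `update_weights`, the `score` field standing in for the paper's information content, the `target_fraction` stopping rule, and the accuracy-ratio form of the decay factor.

```python
def augment_block(block, store, minority_label, target_fraction=0.4):
    """Return the current block plus the highest-scoring stored minority samples.

    block: list of (features, label) pairs for the current data block.
    store: list of (score, features) pairs kept from earlier blocks; the score
           is a placeholder for a per-sample information content (assumed).
    """
    augmented = list(block)
    n_min = sum(1 for _, label in block if label == minority_label)
    # Add stored samples in descending score order until the minority class
    # reaches the target fraction or the store is exhausted.
    for _, features in sorted(store, key=lambda item: item[0], reverse=True):
        if augmented and n_min / len(augmented) >= target_fraction:
            break
        augmented.append((features, minority_label))
        n_min += 1
    return augmented


def update_weights(weights, prev_acc, curr_acc, drop_below=0.1):
    """Decay each base classifier's weight by its own factor, then prune.

    weights, prev_acc, curr_acc: dicts keyed by classifier id.
    """
    kept = {}
    for cid, weight in weights.items():
        # Assumed dynamic decay factor: accuracy on the current block divided
        # by accuracy on the previous block, capped at 1 so weights never grow.
        if prev_acc[cid] > 0:
            decay = min(1.0, curr_acc[cid] / prev_acc[cid])
        else:
            decay = 0.0
        new_weight = weight * decay
        # Classifiers whose weight falls below the threshold are eliminated.
        if new_weight >= drop_below:
            kept[cid] = new_weight
    return kept


if __name__ == "__main__":
    block = [([0.1], 0)] * 8 + [([0.9], 1)] * 2
    store = [(0.2, [0.8]), (0.9, [1.0]), (0.5, [0.7])]
    print(len(augment_block(block, store, minority_label=1)))
    print(update_weights({"a": 1.0, "b": 1.0},
                         {"a": 0.9, "b": 0.8},
                         {"a": 0.9, "b": 0.04}))
```

Because each classifier's factor is recomputed from its own recent accuracy, a classifier that still fits the current concept keeps its weight while one that has drifted out of date is down-weighted quickly and eventually removed. A static factor would shrink both at the same rate.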
Pages: 5394-5419
Page count: 25