A comprehensive ensemble classification techniques detecting and managing concept drift in dynamic imbalanced data streams

Citations: 0
Authors
Junaid, K. A. Mohamed [1]
Paulraj, D. [2]
Sethukarasi, T. [2]
Affiliations
[1] R M K Engn Coll, Dept Elect & Commun Engn, Chennai, India
[2] R M K Engn Coll, Dept Comp Sci & Engn, Chennai, India
Keywords
Machine learning; Ensemble classifier; Concept drift; Heterogeneous data stream
DOI
10.1007/s11276-024-03742-0
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Data stream mining is essential in fields such as education, the Internet of Things (IoT), social media, entertainment, weather monitoring, and finance, because applications in these sectors continuously generate large volumes of data. These data streams are prone to concept drift and are also heterogeneous and imbalanced. Contemporary methods for imbalanced learning in data mining often employ classifiers tailored to the number of features required for classification. Because data distributions change constantly and data streams are unbounded and fast, controlling concept drift is a necessity. Concept drift is an obstacle in heterogeneous stream data mining, marked by noticeable variations that range from massive shifts to more complex changes. Conventional approaches to drift often rely on fixed-size blocks or windows, which makes it hard to handle events that change continuously. This paper introduces a novel approach called "Ensemble Classification Techniques Detecting and Managing Concept Drift in Dynamic and Imbalanced Data Streams" to address these issues. Our method aims to adapt to different types of concept drift by providing precise and flexible classification of distinct data streams. The proposed ensemble classifier contributes to stream data mining by addressing the challenges associated with dynamic concept drift. Experimental results show that the proposed method outperforms existing methods: it obtains a precision of 69.28% and a recall of 69.54%, giving it an advantage over other methods, whose results are almost identical to one another.
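The abstract contrasts the proposed ensemble with conventional drift handling based on fixed-size windows. As a rough illustration of that conventional baseline only (not the authors' method), the sketch below compares the error rate of the newest fixed-size window with that of the window before it. The class name `FixedWindowDriftDetector`, the helper `first_drift_index`, and the parameters `window_size` and `threshold` are hypothetical names chosen for this example.

```python
from collections import deque


class FixedWindowDriftDetector:
    """Illustrative fixed-window check (hypothetical, not the paper's method).

    Drift is flagged when the error rate of the newest window exceeds the
    error rate of the preceding window by more than `threshold`.
    """

    def __init__(self, window_size=50, threshold=0.2):
        self.window_size = window_size
        self.threshold = threshold
        self.reference = deque(maxlen=window_size)  # older window
        self.current = deque(maxlen=window_size)    # newest window

    def update(self, error):
        """Add one prediction error (0 = correct, 1 = wrong).

        Returns True if drift is flagged after this item.
        """
        if len(self.current) == self.window_size:
            # The item about to leave the newest window joins the older one.
            self.reference.append(self.current[0])
        self.current.append(error)
        if len(self.reference) < self.window_size:
            return False  # not enough history to compare two full windows
        ref_rate = sum(self.reference) / self.window_size
        cur_rate = sum(self.current) / self.window_size
        return cur_rate - ref_rate > self.threshold


def first_drift_index(errors, window_size=50, threshold=0.2):
    """Return the index of the first flagged item, or None if none is flagged."""
    detector = FixedWindowDriftDetector(window_size, threshold)
    for i, error in enumerate(errors):
        if detector.update(error):
            return i
    return None


# Synthetic error stream (assumption): always correct for 100 items,
# then always wrong, imitating an abrupt drift at index 100.
abrupt = [0] * 100 + [1] * 100
stable = [0] * 200

print("abrupt drift flagged at:", first_drift_index(abrupt))
print("stable stream flagged at:", first_drift_index(stable))
```

In this sketch the window size and threshold are fixed in advance, so a drift that unfolds more slowly than one window, or that shifts the error rate by less than the threshold, can go unflagged. That is the kind of limitation the abstract attributes to fixed-size blocks.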
Pages: 19-30
Page count: 12