Deep learning framework for handling concept drift and class imbalanced complex decision-making on streaming data

Cited by: 0

Authors:

S. Priya

R. Annie Uthra

Affiliation:

[1] SRM Institute of Science and Technology, Department of Computer Science and Engineering, College of Engineering and Technology

Source:

Complex & Intelligent Systems | 2023 / Vol. 9

Keywords:

Data science; Complex systems; Decision making; Streaming data; Concept drift; Classification model; Deep learning; Class imbalance data

DOI:

Not available

Abstract:

Data science has become a popular means of supporting and improving decision-making. As data streaming applications have become widespread, class imbalance and concept drift have become crucial learning problems. Deep learning (DL) models have proven useful for classification under concept drift in data streaming applications. This paper presents an effective class imbalance with concept drift detection (CIDD) approach using Adadelta optimizer-based deep neural networks (ADODNN), named the CIDD-ADODNN model, for the classification of highly imbalanced streaming data. The model involves four processes: preprocessing, class imbalance handling, concept drift detection, and classification. It uses the adaptive synthetic (ADASYN) technique to handle class imbalance; ADASYN applies a weighted distribution to minority class examples according to how difficult they are to learn. Next, a drift detection technique called adaptive sliding window (ADWIN) is employed to detect concept drift. The ADODNN model then performs the classification. To improve the classification performance of the DNN, an ADO-based hyperparameter tuning process determines its optimal parameters. The model is evaluated on three streaming datasets: the intrusion detection (NSL KDDCup) dataset, the Spam dataset, and the Chess dataset. A detailed comparative analysis of the simulation results confirms the superior performance of the presented model, which attains a maximum accuracy of 0.9592, 0.9320, and 0.7646 on the KDDCup, Spam, and Chess datasets, respectively.
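The weighting idea behind ADASYN can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `adasyn_allocation`, the parameters `k` and `beta`, and the toy data are assumptions made for illustration. The sketch follows the standard ADASYN rule, in which each minority example receives synthetic samples in proportion to the share of majority-class points among its k nearest neighbours. Only the allocation step is shown; the interpolation that creates the synthetic points is omitted.

```python
import math


def adasyn_allocation(minority, majority, k=3, beta=1.0):
    """Return the number of synthetic samples assigned to each minority point.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    # Total synthetic samples needed to reach the desired balance level.
    total = int((len(majority) - len(minority)) * beta)

    # Minority points come first, so index i below refers to the same point
    # in both `minority` and `labelled`.
    labelled = [(p, 1) for p in minority] + [(p, 0) for p in majority]

    ratios = []
    for i, x in enumerate(minority):
        # Distances from x to every other point, regardless of class.
        others = [
            (math.dist(x, p), label)
            for j, (p, label) in enumerate(labelled)
            if j != i
        ]
        others.sort(key=lambda pair: pair[0])
        neighbours = others[:k]
        # Difficulty score: fraction of the k neighbours in the majority class.
        ratios.append(sum(1 for _, label in neighbours if label == 0) / k)

    norm = sum(ratios)
    if norm == 0:
        # No minority point has a majority neighbour; nothing to allocate.
        return [0] * len(minority)

    # Normalise the scores into a distribution and scale by the total.
    # Rounding means the counts may not sum exactly to `total`.
    return [round(r / norm * total) for r in ratios]
```

In this sketch, a minority point surrounded by majority points gets a high score and therefore most of the synthetic samples, while points deep inside a minority cluster get few or none.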

Citation

Pages: 3499-3515

Page count: 16
