A novel approach using incremental oversampling for data stream mining

Cited by: 5
Authors
Anupama, N. [1]
Jena, Sudarson [2]
Affiliations
[1] GITAM Univ, Hyderabad, India
[2] Sambalpur Univ, Inst Informat Technol, Sambalpur, India
Keywords
Knowledge discovery; Data streams; Imbalanced data; Oversampling; Increment over sampling for data streams (IOSDS); Classification
DOI
10.1007/s12530-018-9249-5
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Data stream mining has become very popular in recent years, as advanced electronic devices generate continuous data streams. The performance of standard learning algorithms is compromised by the class imbalance present in real-world data streams. In this paper, we propose a novel algorithm, dubbed increment over sampling for data streams (IOSDS), which uses a unique oversampling technique to almost balance the data sets and thereby minimize the effect of imbalance on the stream mining process. The experimental analysis is conducted on 15 data chunks of data streams with varied sizes and different imbalance ratios. The results suggest that the proposed IOSDS algorithm improves knowledge discovery over benchmark algorithms such as C4.5 and Hoeffding tree in terms of standard performance measures, namely accuracy, AUC, precision, recall, and F-measure.
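The abstract does not spell out how IOSDS generates its samples, so the following is only a minimal sketch of the general idea of oversampling one incoming chunk until it is almost balanced. It uses plain random duplication of minority instances as a stand-in, not the authors' technique. The function name `oversample_chunk`, the `target_ratio` parameter, and the toy chunk are all hypothetical.

```python
import random
from collections import Counter


def oversample_chunk(chunk, target_ratio=0.9, seed=0):
    """Randomly duplicate minority-class instances in one data chunk.

    chunk: list of (features, label) pairs.
    target_ratio: desired minority/majority count ratio (assumed parameter).
    NOTE: plain random oversampling, an illustrative stand-in, not IOSDS itself.
    """
    rng = random.Random(seed)
    counts = Counter(label for _, label in chunk)
    if len(counts) < 2:
        return list(chunk)  # only one class present, nothing to balance
    majority_count = max(counts.values())
    wanted = int(target_ratio * majority_count)
    balanced = list(chunk)
    for label, count in counts.items():
        if count >= wanted:
            continue  # class is already close enough to the majority size
        pool = [inst for inst in chunk if inst[1] == label]
        # Draw duplicates with replacement until the class reaches `wanted`.
        balanced.extend(rng.choice(pool) for _ in range(wanted - count))
    return balanced


# Hypothetical chunk: 8 majority (label 0) and 2 minority (label 1) instances.
chunk = [((i, i + 1), 0) for i in range(8)] + [((9, 9), 1), ((7, 5), 1)]
balanced = oversample_chunk(chunk)
print(Counter(label for _, label in balanced))
```

Keeping `target_ratio` slightly below 1 mirrors the abstract's wording of "almost" balancing the data. In a streaming setting the function would be applied to each chunk before the chunk is passed to the learner.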
Pages: 351-362
Number of pages: 12