Dynamic Weighted Majority for Incremental Learning of Imbalanced Data Streams with Concept Drift

Times cited: 0
Authors
Lu, Yang [1]
Cheung, Yiu-ming [1, 2]
Tang, Yuan Yan [3]
Affiliations
[1] Hong Kong Baptist Univ, Dept Comp Sci, Hong Kong, Peoples R China
[2] HKBU Inst Res & Continuing Educ, Shenzhen, Peoples R China
[3] Univ Macau, Dept Comp & Informat Sci, Taipa, Macao, Peoples R China
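Source
PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE | 2017
Funding
National Natural Science Foundation of China
Keywords
ENSEMBLE
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract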
Concept drift in data streams jeopardizes the accuracy and stability of online learning. If the data stream is also imbalanced, detecting and handling the drift becomes even more challenging. In the literature, these two problems have been studied intensively in isolation, but have yet to be well studied when they occur together. In this paper, we propose a chunk-based incremental learning method called Dynamic Weighted Majority for Imbalance Learning (DWMIL) to deal with data streams that exhibit both concept drift and class imbalance. DWMIL uses an ensemble framework that dynamically weights the base classifiers according to their performance on the current data chunk. Compared with existing methods, its merits are four-fold: (1) it remains stable on non-drifted streams and adapts quickly to a new concept; (2) it is fully incremental, i.e., no previous data needs to be stored; (3) it keeps a limited number of classifiers to ensure high efficiency; and (4) it is simple and needs only one thresholding parameter. Experiments on both synthetic and real data sets with concept drift show that DWMIL performs better than state-of-the-art competitors, with less computational cost.
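The chunk-by-chunk weighting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the update rule, so the multiplicative update w <- w * (1 - error), the G-mean-based error, the toy nearest-centroid base learner, and every name below (CentroidClassifier, gmean_error, ChunkEnsemble, threshold) are assumptions chosen for illustration only.

```python
class CentroidClassifier:
    """Toy base learner (an assumption): predict the class with the nearest centroid."""

    def __init__(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda label: sq_dist(self.centroids[label]))


def gmean_error(clf, X, y):
    """1 - geometric mean of per-class recall (assumed imbalance-aware error)."""
    product, n_classes = 1.0, 0
    for label in set(y):
        idx = [i for i, t in enumerate(y) if t == label]
        hits = sum(1 for i in idx if clf.predict(X[i]) == label)
        product *= hits / len(idx)
        n_classes += 1
    return 1.0 - product ** (1.0 / n_classes)


class ChunkEnsemble:
    """Weighted ensemble updated one chunk at a time; no past chunk is stored."""

    def __init__(self, threshold=0.1):
        self.threshold = threshold  # the single pruning parameter (assumed role)
        self.members = []           # list of [classifier, weight]

    def predict(self, x):
        if not self.members:
            raise ValueError("ensemble has no members yet")
        votes = {}
        for clf, weight in self.members:
            label = clf.predict(x)
            votes[label] = votes.get(label, 0.0) + weight
        return max(votes, key=votes.get)

    def update(self, X, y):
        # 1. Decay each member's weight by its error on the current chunk.
        for member in self.members:
            member[1] *= 1.0 - gmean_error(member[0], X, y)
        # 2. Drop members whose weight fell below the threshold.
        self.members = [m for m in self.members if m[1] >= self.threshold]
        # 3. Train a new member on the current chunk with full weight.
        self.members.append([CentroidClassifier(X, y), 1.0])


# Two imbalanced chunks; the class regions swap in the second one (a drift).
ensemble = ChunkEnsemble(threshold=0.1)
ensemble.update([[0.0], [0.2], [0.1], [5.0]], [0, 0, 0, 1])
ensemble.update([[5.0], [5.2], [4.9], [0.0]], [0, 0, 0, 1])
```

Under these assumptions, a member that keeps performing well on new chunks retains its weight, which matches the stability on non-drifted streams claimed in the abstract, while a member trained on an outdated concept decays and is pruned, which bounds the ensemble size.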
Pages: 2393-2399
Number of pages: 7
Related papers
20 in total
[1]  
[Anonymous], 2013, EVIDENCE BASED COMPL, V2013, P1, DOI [10.1155/2013/376327, DOI 10.4209/AAQR.2013.01.0020]
[2]  
[Anonymous], 2013, INT J COMPUTATIONAL, V1, P1
[3]  
[Anonymous], 2012, P 18 ACM SIGKDD INT, DOI 10.1145/2339530.2339558
[4]   A Survey of Predictive Modeling on Imbalanced Domains [J].
Branco, Paula ;
Torgo, Luis ;
Ribeiro, Rita P. .
ACM COMPUTING SURVEYS, 2016, 49 (02)
[5]  
Breiman, Leo, 1984, Classification and Regression Trees
[6]   Towards incremental learning of nonstationary imbalanced data stream: a multiple selectively recursive approach [J].
Chen, Sheng ;
He, Haibo .
EVOLVING SYSTEMS, 2011, 2 (01) :35-50
[7]   Learning in Nonstationary Environments: A Survey [J].
Ditzler, Gregory ;
Roveri, Manuel ;
Alippi, Cesare ;
Polikar, Robi .
IEEE COMPUTATIONAL INTELLIGENCE MAGAZINE, 2015, 10 (04) :12-25
[8]   Incremental Learning of Concept Drift from Streaming Imbalanced Data [J].
Ditzler, Gregory ;
Polikar, Robi .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2013, 25 (10) :2283-2301
[9]   Incremental Learning of Concept Drift in Nonstationary Environments [J].
Elwell, Ryan ;
Polikar, Robi .
IEEE TRANSACTIONS ON NEURAL NETWORKS, 2011, 22 (10) :1517-1531
[10]  
Gao J, 2007, PROCEEDINGS OF THE SEVENTH SIAM INTERNATIONAL CONFERENCE ON DATA MINING, P3