GOS-IL: A Generalized Over-Sampling Based Online Imbalanced Learning Framework

Cited by: 4
Authors
Barua, Sukarna [1]
Islam, Md. Monirul [1]
Murase, Kazuyuki [2]
Affiliations
[1] Bangladesh Univ Engn & Technol, Dhaka, Bangladesh
[2] Univ Fukui, Fukui 910, Japan
Source
NEURAL INFORMATION PROCESSING, PT I | 2015 / Vol. 9489
Keywords
Imbalanced learning; Online learning; Oversampling
DOI
10.1007/978-3-319-26532-2_75
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Online imbalanced learning has two important characteristics: samples of one class (the minority class) are under-represented in the data set, and samples arrive at the learner online and incrementally. Such a data set poses several problems for the learner. First, it is impossible to determine the minority class beforehand, as the learner has no complete view of the whole data. Second, the status of the imbalance may change over time. To handle such a data set efficiently, we present a dynamic and adaptive algorithm called the Generalized Over-Sampling based Online Imbalanced Learning (GOS-IL) framework. The proposed algorithm works by updating a base learner incrementally. This update is triggered when the number of errors made by the learner crosses a threshold value. The deferred update helps the learner avoid the immediate harm of noisy samples and achieve better generalization ability in the long run. In addition, correctly classified samples are not used to update the learner, in order to avoid over-fitting. Simulation results on several artificial and real-world data sets show the effectiveness of the proposed method on two performance metrics: recall and g-mean.
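The deferred-update idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the base learner, the threshold value, or the over-sampling step, so the `Perceptron` learner, the `DeferredUpdater` class, the `threshold` parameter, and the choice to trigger when the error count reaches the threshold are all assumptions made for illustration, and over-sampling of the buffered samples is omitted. The `g_mean` helper uses the common definition of g-mean as the geometric mean of per-class recalls.

```python
import math


class Perceptron:
    """Hypothetical incremental base learner (the abstract names none)."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        score = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1 if score > 0 else 0

    def partial_fit(self, samples):
        # One perceptron step per buffered (x, y) pair.
        for x, y in samples:
            err = y - self.predict(x)  # -1, 0, or +1
            self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]
            self.b += self.lr * err


class DeferredUpdater:
    """Buffers misclassified samples; updates only at the error threshold."""

    def __init__(self, learner, threshold=5):
        self.learner = learner
        self.threshold = threshold  # assumed value, not from the paper
        self.errors = []
        self.updates = 0

    def observe(self, x, y):
        y_hat = self.learner.predict(x)
        if y_hat != y:
            # Only misclassified samples are kept for the next update.
            self.errors.append((x, y))
            if len(self.errors) >= self.threshold:
                # The paper over-samples here; this sketch skips that step.
                self.learner.partial_fit(self.errors)
                self.errors.clear()
                self.updates += 1
        return y_hat


def g_mean(y_true, y_pred):
    """Geometric mean of the recall of each class present in y_true."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return math.prod(recalls) ** (1 / len(recalls))
```

In a prequential (test-then-train) loop, each arriving sample would be passed to `observe`, and the returned predictions could be scored with `g_mean` over a window of recent samples.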
Pages: 680-687
Page count: 8