Boosted SVM with active learning strategy for imbalanced data

Cited by: 24
Authors
Zieba, Maciej [1]
Tomczak, Jakub M. [1]
Affiliations
[1] Wroclaw Univ Technol, Fac Comp Sci & Management, PL-50370 Wroclaw, Poland
Keywords
Imbalanced data; Boosted SVM; Active learning; Support vector machines; Credit; Classifiers; Prediction; Selection
DOI
10.1007/s00500-014-1407-5
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this work, we introduce a novel training method for constructing boosted Support Vector Machines (SVMs) directly from imbalanced data. The proposed solution incorporates active learning mechanisms to eliminate redundant instances and to better estimate misclassification costs for each base SVM in the committee. To evaluate our approach, we conduct comprehensive experiments on a set of benchmark datasets with various imbalance ratios. In addition, we present an application of our method to a real-life decision problem: predicting the repayment of short-term loans.
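
The abstract refers to imbalance ratios and per-class misclassification costs without defining them. As a rough illustration only, the sketch below computes an imbalance ratio (majority count divided by minority count) and a common frequency-based cost heuristic, n_samples / (n_classes * n_samples_in_class). The function names and the heuristic are assumptions made for this example; the record does not state how the paper estimates costs.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Return the majority-class count divided by the minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def balanced_class_costs(labels):
    """Assign each class a cost inversely proportional to its frequency.

    Assumption: uses the generic heuristic
    n_samples / (n_classes * n_samples_in_class), not the paper's estimate.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Hypothetical toy labels: 90 majority (0) and 10 minority (1) instances.
labels = [0] * 90 + [1] * 10
print(imbalance_ratio(labels))
print(balanced_class_costs(labels))
```

Under this heuristic, errors on the rarer class are penalized more heavily, which is the general idea behind cost-sensitive training on imbalanced data.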
Pages: 3357-3368
Number of pages: 12