SVM classification for imbalanced data sets using a multiobjective optimization framework

Cited by: 12
Authors
Askan, Aysegul [1]
Sayin, Serpil [2]
Affiliations
[1] Garanti Teknol, TR-34212 Istanbul, Turkey
[2] Koc Univ, Coll Adm Sci & Econ, TR-34450 Istanbul, Turkey
Keywords
SVM; Imbalanced data; Multiobjective optimization; Efficient frontier; ROBUST
DOI
10.1007/s10479-012-1300-5
Chinese Library Classification number
C93 [Management]; O22 [Operations Research]
Discipline classification code
070105; 12; 1201; 1202; 120202
Abstract
Classification of imbalanced data sets in which negative instances outnumber the positive instances is a significant challenge. These data sets are commonly encountered in real-life problems. However, the performance of well-known classifiers is limited in such cases. Various solution approaches have been proposed for the class imbalance problem using either data-level or algorithm-level modifications. Support Vector Machines (SVMs), which have a solid theoretical background, also suffer a dramatic decrease in performance when the data distribution is imbalanced. In this study, we propose an L1-norm SVM approach based on a three-objective optimization problem, so that the error sums for the two classes enter the formulation independently. Motivated by the inherently multiobjective nature of SVMs, the solution approach reduces the problem to two-criteria formulations and investigates the efficient frontier systematically. The results indicate that a comprehensive treatment of distinct positive and negative error levels may lead to performance improvements, at the cost of varying degrees of increased computational effort.
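The abstract does not reproduce the model itself, so the following is only a sketch of how a three-objective L1-norm soft-margin SVM with separate class error sums could be written. The symbols are assumptions introduced here, not taken from the paper: training points $x_i$ with labels $y_i \in \{+1,-1\}$, index sets $I_+$ and $I_-$ for the positive and negative classes, weight vector $w$, bias $b$, and slack variables $\xi_i$.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad
  & \Bigl(\ \lVert w \rVert_1,\ \ \sum_{i \in I_+} \xi_i,\ \ \sum_{i \in I_-} \xi_i \ \Bigr) \\
\text{s.t.}\quad
  & y_i \,\bigl(w^{\top} x_i + b\bigr) \;\ge\; 1 - \xi_i, \qquad i = 1,\dots,n, \\
  & \xi_i \;\ge\; 0, \qquad i = 1,\dots,n.
\end{aligned}
```

One way to obtain a two-criteria problem from this, again an illustrative assumption and not necessarily the reduction the authors use, is to fix a weight $\lambda \in [0,1]$ and merge the two error sums into $\lambda \sum_{i \in I_+} \xi_i + (1-\lambda) \sum_{i \in I_-} \xi_i$. Each such problem is a bi-criteria linear program once $\lVert w \rVert_1$ is linearized, and its efficient frontier can then be traced for several values of $\lambda$.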
Pages: 191-203
Number of pages: 13