Adjusted F-measure and kernel scaling for imbalanced data learning

Cited by: 110
Authors
Maratea, Antonio [1]
Petrosino, Alfredo [1]
Manzo, Mario [1]
Affiliations
[1] Univ Naples Parthenope, Ctr Direz, Dept Appl Sci, I-80143 Naples, Italy
Keywords
Imbalanced learning; Asymmetric SVM; Kernel scaling; Conformal transformation; Adjusted F-measure; Support vector machines; Classification
DOI
10.1016/j.ins.2013.04.016
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Rare events are involved in many challenging real-world classification problems, where the minority class is usually the most expensive to sample and to label. As a consequence, training data are often imbalanced, presenting a heavily skewed distribution of labels. Conventional classification techniques produce biased results on such data, as the classifier may easily show very good performance on the over-represented class and very poor performance on the under-represented class: the former dominates the learning process and tends to attract all predictions. Furthermore, the classical accuracy measure is misleading, as it assumes equal importance for the true positives and the true negatives. We propose a classification procedure based on Support Vector Machines that can effectively cope with data imbalance. Using a first-step approximate solution and then a suitable kernel transformation, we asymmetrically enlarge the space around the class boundary, compensating for data skewness. We also propose an accuracy measure, named AGF, that properly accounts for the different misclassification costs of the two classes. Tests on real-world data from a public repository show that the proposed approach outperforms its competitors. (C) 2013 Elsevier Inc. All rights reserved.
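The abstract's remark that plain accuracy is misleading on skewed labels can be illustrated with a small sketch. The data below are synthetic and the helper names (`confusion_counts`, `accuracy`, `f_beta`) are hypothetical, chosen only for this illustration; the score shown is the standard F-beta measure, not the paper's AGF, whose definition is not given in this record.

```python
# Illustrative only: synthetic labels, hypothetical helper names.
# Shows accuracy vs. the standard F-beta score on a skewed label set.

def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of correct predictions; weights both classes by their size."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def f_beta(tp, fp, fn, beta=1.0):
    """Standard F-beta score; beta > 1 weights recall more than precision."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


# 10 rare positives among 1000 samples (assumed ratio, for illustration).
y_true = [1] * 10 + [0] * 990
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 1000

counts = confusion_counts(y_true, y_pred)
tp, fp, fn, tn = counts
acc = accuracy(*counts)
f1 = f_beta(tp, fp, fn)

print(f"accuracy = {acc:.2f}, F1 = {f1:.2f}")
```

Under these assumptions the majority-only classifier scores high on accuracy while finding none of the rare positives, which is the kind of bias a class-sensitive measure is meant to expose.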
Pages: 331-341
Page count: 11