F-measure curves: A tool to visualize classifier performance under imbalance

Cited by: 49

Authors:

Soleymani, Roghayeh [1]

Granger, Eric [1]

Fumera, Giorgio [2]

Affiliations:

[1] Univ Quebec, Dept Syst Engn, Lab Imagerie Vis & Intelligence Artificielle LIVI, Ecole Technol Super, Montreal, PQ, Canada

[2] Univ Cagliari, Pattern Recognit & Applicat Grp, Dept Elect & Elect Engn, Cagliari, Italy

Source:

PATTERN RECOGNITION | 2020 / Vol. 100

Keywords:

Pattern classification; Class imbalance; Performance metrics; F-measure; Visualization tools; Video face recognition; FACE RECOGNITION; ADAPTIVE ENSEMBLES; SURVEILLANCE;

DOI:

10.1016/j.patcog.2019.107146

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Learning from imbalanced data is a challenging problem in many real-world machine learning applications, due in part to the performance bias of most classification systems. This bias may arise for three reasons: (1) classification systems are often optimized and compared using performance measures that are unsuitable for imbalanced problems; (2) most learning algorithms are designed and tested on data with a fixed imbalance level, which may differ from operational scenarios; (3) the relative preference for correct classification of each class differs from one application to another. This paper investigates specialized performance evaluation metrics and tools for imbalanced problems, including scalar metrics that assume a given operating condition (skew level and relative preference of classes), and global evaluation curves or metrics that consider a range of operating conditions. We focus on the case in which the F-measure is preferred over other scalar metrics, and propose a new global evaluation space for the F-measure that is analogous to cost curves for expected cost. In this space, a classifier is represented as a curve that shows its performance over all of its decision thresholds and a range of possible imbalance levels, for the desired preference of true positive rate to precision. Curves obtained in the F-measure space are compared to those of existing spaces (ROC, precision-recall, and cost). As with cost curves, the proposed F-measure space makes it easier to visualize and compare classifiers' performance under different operating conditions than the ROC and precision-recall spaces do. This space allows us to set the optimal decision threshold of a soft classifier and to select the best classifier among a group. It also makes it possible to empirically improve the performance of ensemble learning methods specialized for class imbalance, by selecting and combining the base classifiers of an ensemble using a modified version of the iterative Boolean combination algorithm that is optimized using the F-measure instead of AUC. Experiments on a real-world dataset for video face recognition show the advantages of evaluating and comparing different classifiers in the F-measure space versus the ROC, precision-recall, and cost spaces. In addition, the F-measure performance of the Bagging ensemble method is shown to improve considerably when the modified iterative Boolean combination algorithm is used. (C) 2019 Published by Elsevier Ltd.
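As a rough illustration of the idea described in the abstract (not the authors' implementation), the sketch below uses the standard identity that expresses the F-measure through the true positive rate, the false positive rate, and the prior of the positive class. The function names `f_measure` and `f_curve`, and the choice of taking the best threshold at each imbalance level (by analogy with the lower envelope of a cost curve), are assumptions made for this example only.

```python
def f_measure(tpr, fpr, pos_prior, beta=1.0):
    """F-beta written in terms of TPR, FPR and the positive-class prior.

    Follows from F_beta = (1 + b^2) TP / ((1 + b^2) TP + b^2 FN + FP)
    with TP = prior * TPR, FN = prior * (1 - TPR), FP = (1 - prior) * FPR.
    """
    if not 0.0 < pos_prior <= 1.0:
        raise ValueError("pos_prior must be in (0, 1]")
    if beta <= 0.0:
        raise ValueError("beta must be positive")
    b2 = beta ** 2
    numerator = (1.0 + b2) * pos_prior * tpr
    denominator = pos_prior * tpr + b2 * pos_prior + (1.0 - pos_prior) * fpr
    return numerator / denominator


def f_curve(roc_points, priors, beta=1.0):
    """Hypothetical helper: best F-beta over all thresholds, per prior.

    roc_points is a list of (tpr, fpr) pairs, one per decision threshold.
    """
    return [
        max(f_measure(tpr, fpr, prior, beta) for tpr, fpr in roc_points)
        for prior in priors
    ]


# Made-up operating points of one soft classifier (illustrative only).
roc_points = [(0.0, 0.0), (0.6, 0.05), (0.8, 0.1), (0.95, 0.4), (1.0, 1.0)]
priors = [0.05, 0.1, 0.25, 0.5]
curve = f_curve(roc_points, priors)
```

Plotting `curve` against `priors` for several classifiers on the same axes would give a comparison across imbalance levels in the spirit of the proposed space; the exact axes and normalization used in the paper may differ.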

Pages: 19

