Evaluation Measures of the Classification Performance of Imbalanced Data Sets

Cited by: 199
Authors
Gu, Qiong [1 ,2 ]
Zhu, Li [2 ]
Cai, Zhihua [2 ]
Affiliations
[1] Xiangfan Univ, Fac Math & Comp Sci, Xiangfan 441053, Hubei, Peoples R China
[2] China Univ Geosci, Sch Comp, Wuhan 430074, Peoples R China
Source
COMPUTATIONAL INTELLIGENCE AND INTELLIGENT SYSTEMS | 2009 / Vol. 51
Keywords
Evaluation; classification performance; imbalanced data sets
DOI
10.1007/978-3-642-04962-0_53
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Discriminant measures of classification performance play a critical role in guiding the design of classifiers. Assessment methods and evaluation measures are at least as important as the algorithm itself and are the first key stage of successful data mining. We systematically summarize the evaluation measures for imbalanced data sets (IDS). Several types of measures, such as common performance evaluation measures and measures that visualize classifier performance, are analyzed and compared. The problems these measures have on IDS may lead to misunderstanding of classification results and even to wrong strategic decisions. In addition, we investigate a series of more complex numerical evaluation measures that can also be used to evaluate classification performance on IDS.
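The abstract's point that common measures can mislead on imbalanced data can be illustrated with a small sketch. It is not taken from the paper: the data are hypothetical (10 positives, 990 negatives), the classifier is a trivial one that always predicts the majority class, and the choice of F-measure and G-mean as contrasting measures is an assumption, since the abstract does not name specific measures.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fn, fp, tn) for a binary problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return tp, fn, fp, tn


def summarize(y_true, y_pred, positive=1):
    """Compute accuracy plus measures that account for the minority class."""
    tp, fn, fp, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fn + fp + tn
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "g_mean": math.sqrt(recall * specificity),
    }


# Hypothetical imbalanced data set: 1% positives (assumption for illustration).
y_true = [1] * 10 + [0] * 990
# Trivial classifier that always predicts the majority (negative) class.
y_pred = [0] * 1000

scores = summarize(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

In this sketch accuracy comes out at 0.99 even though no minority-class example is detected, while recall, F-measure, and G-mean are all 0, which is the kind of misreading the abstract warns about.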
Pages: 461 / +
Page count: 2
Related papers
14 in total
[1]  
[Anonymous], MACHINE LEARNING
[2]  
Biggerstaff BJ, 2000, STAT MED, V19, P649, DOI 10.1002/(SICI)1097-0258(20000315)19:5<649::AID-SIM371>3.0.CO;2-H
[4]   NONINVASIVE CAROTID-ARTERY TESTING - A METAANALYTIC REVIEW [J].
BLAKELEY, DD ;
ODDONE, EZ ;
HASSELBLAD, V ;
SIMEL, DL ;
MATCHAR, DB .
ANNALS OF INTERNAL MEDICINE, 1995, 122 (05) :360-367
[5]   The use of the area under the roc curve in the evaluation of machine learning algorithms [J].
Bradley, AP .
PATTERN RECOGNITION, 1997, 30 (07) :1145-1159
[6]  
Davis J., 2006, P 23 INT C MACH LEAR, P233, DOI 10.1145/1143844.1143874
[7]   Cost curves: An improved method for visualizing classifier performance [J].
Drummond, Chris ;
Holte, Robert C. .
MACHINE LEARNING, 2006, 65 (01) :95-130
[8]   Machine learning for the detection of oil spills in satellite radar images [J].
Kubat, M ;
Holte, RC ;
Matwin, S .
MACHINE LEARNING, 1998, 30 (2-3) :195-215
[9]   Receiver operating characteristic methodology [J].
Pepe, MS .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2000, 95 (449) :308-311
[10]  
Provost F., 1997, Proceedings of the Third International Conference on Knowledge Discovery and Data Mining, P43