Evaluation Measures of the Classification Performance of Imbalanced Data Sets

Cited by: 185
Authors
Gu, Qiong [1 ,2 ]
Zhu, Li [2 ]
Cai, Zhihua [2 ]
Affiliations
[1] Xiangfan Univ, Fac Math & Comp Sci, Xiangfan 441053, Hubei, Peoples R China
[2] China Univ Geosci, Sch Comp, Wuhan 430074, Peoples R China
Source
COMPUTATIONAL INTELLIGENCE AND INTELLIGENT SYSTEMS | 2009 / Vol. 51
Keywords
Evaluation; classification performance; imbalanced data sets;
DOI
10.1007/978-3-642-04962-0_53
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Discriminant measures of classification performance play a critical role in guiding the design of classifiers. Assessment methods and evaluation measures are at least as important as the algorithm itself and are the first key stage of successful data mining. We systematically summarize the evaluation measures for imbalanced data sets (IDS). Several types of measures, such as common performance evaluation measures and measures that visualize classifier performance, are analyzed and compared. The problems these measures have with IDS may lead to misunderstanding of classification results and even to wrong strategic decisions. In addition, a series of complex numerical evaluation measures that can also serve to evaluate classification performance on IDS are investigated.
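The abstract's point that common measures can mislead on imbalanced data can be illustrated with a minimal sketch. The abstract does not name specific measures, so the ones below (accuracy, recall, specificity, precision, F1, G-mean) are assumptions chosen because they are widely used; the 95/5 class split and the always-negative classifier are hypothetical and are not results from the paper.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FN, FP, TN, treating `positive` as the minority class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fn, fp, tn


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def summarize(y_true, y_pred, positive=1):
    """Compute several commonly used measures (an assumed selection)."""
    tp, fn, fp, tn = confusion_counts(y_true, y_pred, positive)
    recall = safe_div(tp, tp + fn)        # true positive rate
    specificity = safe_div(tn, tn + fp)   # true negative rate
    precision = safe_div(tp, tp + fp)
    return {
        "accuracy": safe_div(tp + tn, tp + fn + fp + tn),
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "g_mean": math.sqrt(recall * specificity),
    }


# Hypothetical data: 95 majority-class (0) and 5 minority-class (1) examples,
# scored against a trivial classifier that always predicts the majority class.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100

report = summarize(y_true, y_pred)
for name, value in report.items():
    print(f"{name}: {value:.2f}")
```

In this toy setting accuracy is high even though no minority-class example is detected, whereas recall, F1 and G-mean all drop to zero, which is the kind of misreading the abstract warns about.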
Pages: 461 / +
Number of pages: 2