An Extension of the Receiver Operating Characteristic Curve and AUC-Optimal Classification

Cited by: 37
Authors
Takenouchi, Takashi [1]
Komori, Osamu [2]
Eguchi, Shinto [2, 3]
Affiliations
[1] Future Univ Hakodate, Fac Syst Informat Sci, Dept Complex & Intelligent Syst, Hakodate, Hokkaido 0418655, Japan
[2] Inst Stat Math, Tachikawa, Tokyo 1908562, Japan
[3] Grad Univ Adv Studies, Dept Stat Sci, Tachikawa, Tokyo 1908562, Japan
Funding
Japan Society for the Promotion of Science
Keywords
AREA
DOI
10.1162/NECO_a_00336
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
While most methods proposed for classification problems focus on minimizing the classification error rate, we are interested in the receiver operating characteristic (ROC) curve, which provides more information about classification performance than the error rate does. The area under the ROC curve (AUC) is a natural measure for overall assessment of a classifier based on the ROC curve. We discuss a class of concave functions for AUC maximization in which a boosting-type algorithm including RankBoost is considered, and we discuss the Bayesian risk consistency and the lower bound of the optimum function. A procedure derived by maximizing a specific optimum function has high robustness, based on gross error sensitivity. Additionally, we focus on the partial AUC, which is the partial area under the ROC curve. For example, in medical screening, a high true-positive rate at a fixed low false-positive rate is preferable, and thus the partial AUC corresponding to lower false-positive rates is much more important than the remaining AUC. We extend the class of concave optimum functions for partial AUC optimality with the boosting algorithm. We investigated the validity of the proposed method through several experiments with data sets in the UCI repository.
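To make the two quantities named in the abstract concrete, the following is a minimal sketch of the empirical AUC and the empirical partial AUC computed from classifier scores. It is not the authors' boosting method; the function names, the `max_fpr` parameter, and the toy scores are assumptions introduced here for illustration. The partial AUC is left unnormalized, so its largest possible value is `max_fpr`, and the sketch assumes `max_fpr * len(neg_scores)` is close to an integer.

```python
def empirical_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    total = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                total += 1.0
            elif p == n:
                total += 0.5
    return total / (len(pos_scores) * len(neg_scores))


def empirical_partial_auc(pos_scores, neg_scores, max_fpr):
    """Area under the empirical ROC curve for false-positive rates in [0, max_fpr].

    Assumption: max_fpr * len(neg_scores) is (close to) an integer k, so the
    region corresponds to the k highest-scoring negatives.
    """
    k = int(round(max_fpr * len(neg_scores)))
    # Only the k highest-scoring negatives produce false positives in this range.
    top_negs = sorted(neg_scores, reverse=True)[:k]
    total = 0.0
    for p in pos_scores:
        for n in top_negs:
            if p > n:
                total += 1.0
            elif p == n:
                total += 0.5
    # Normalize by all pairs so that max_fpr = 1 recovers the full AUC.
    return total / (len(pos_scores) * len(neg_scores))


# Hypothetical toy scores: higher means "more likely positive".
pos = [0.9, 0.8, 0.4]
neg = [0.7, 0.3, 0.2, 0.1]

print(empirical_auc(pos, neg))                # 11 of 12 pairs are ordered correctly
print(empirical_partial_auc(pos, neg, 0.25))  # only the top negative (0.7) is used
```

In this toy data the positive scored 0.4 is outranked by the negative scored 0.7. That single error costs one pair out of twelve in the full AUC, but it costs one of only three pairs in the region with false-positive rate at most 0.25, which illustrates why the partial AUC is more sensitive to mistakes among the highest-scoring negatives.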
Pages: 2789-2824
Page count: 36