One-pass AUC optimization

被引:21
作者
Gao, Wei [1 ,2 ]
Wang, Lu [1 ,2 ]
Jin, Rong [3 ,4 ]
Zhu, Shenghuo [4 ]
Zhou, Zhi-Hua [1 ,2 ]
机构
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Jiangsu, Peoples R China
[2] Collaborat Innovat Ctr Novel Software Technol & I, Nanjing 210023, Jiangsu, Peoples R China
[3] Michigan State Univ, Dept Comp Sci & Engn, E Lansing, MI 48824 USA
[4] Alibaba Grp, Inst Data Sci & Technol, Seattle, WA USA
基金
美国国家科学基金会;
关键词
AUC; ROC curve; Online learning; Large-scale learning; Least square loss; Random projection; GENERALIZATION BOUNDS; RANKING; AREA;
D O I
10.1016/j.artint.2016.03.003
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
AUC is an important performance measure that has been used in diverse tasks, such as class-imbalanced learning, cost-sensitive learning, learning to rank, etc. In this work, we focus on one-pass AUC optimization that requires going through training data only once without having to store the entire training dataset. Conventional online learning algorithms cannot be applied directly to one-pass AUC optimization because AUC is measured by a sum of losses defined over pairs of instances from different classes. We develop a regression-based algorithm which only needs to maintain the first and second order statistics of training data in memory, resulting in a storage requirement independent of the number of training data. To efficiently handle high-dimensional data, we develop two deterministic algorithms that approximate the covariance matrices. We verify, both theoretically and empirically, the effectiveness of the proposed algorithms. (C) 2016 Elsevier B.V. All rights reserved.
引用
收藏
页码:1 / 29
页数:29
相关论文
共 53 条
  • [1] Learnability of bipartite ranking functions
    Agarwal, S
    Roth, D
    [J]. LEARNING THEORY, PROCEEDINGS, 2005, 3559 : 16 - 31
  • [2] Agarwal S, 2005, J MACH LEARN RES, V6, P393
  • [3] Agarwal S., 2013, C LEARNING THEORY, P338
  • [4] Agarwal S, 2009, J MACH LEARN RES, V10, P441
  • [5] [Anonymous], 2002, Statistical Methods in Diagnostic Medicine
  • [6] [Anonymous], 2003, Journal of machine learning research
  • [7] [Anonymous], 2005, P 22 INT C MACHINE L, DOI DOI 10.1145/1102351.1102399
  • [8] [Anonymous], 2013, ICML
  • [9] [Anonymous], 2004, P 21 INT C MACH LEAR, DOI DOI 10.1145/1015330.1015332
  • [10] [Anonymous], P 22 INT C MACH LEAR