One-pass AUC optimization

被引：23

作者：

Gao, Wei ^{[1
,2
]}

Wang, Lu ^{[1
,2
]}

Jin, Rong ^{[3
,4
]}

Zhu, Shenghuo ^{[4
]}

Zhou, Zhi-Hua ^{[1
,2
]}

机构：

[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Jiangsu, Peoples R China

[2] Collaborat Innovat Ctr Novel Software Technol & I, Nanjing 210023, Jiangsu, Peoples R China

[3] Michigan State Univ, Dept Comp Sci & Engn, E Lansing, MI 48824 USA

[4] Alibaba Grp, Inst Data Sci & Technol, Seattle, WA USA

来源：

ARTIFICIAL INTELLIGENCE | 2016年 / 236卷

基金：

美国国家科学基金会;

关键词：

AUC; ROC curve; Online learning; Large-scale learning; Least square loss; Random projection; GENERALIZATION BOUNDS; RANKING; AREA;

D O I：

10.1016/j.artint.2016.03.003

中图分类号：

TP18 [人工智能理论];

学科分类号：

081104 ; 0812 ; 0835 ; 1405 ;

摘要：

AUC is an important performance measure that has been used in diverse tasks, such as class-imbalanced learning, cost-sensitive learning, learning to rank, etc. In this work, we focus on one-pass AUC optimization that requires going through training data only once without having to store the entire training dataset. Conventional online learning algorithms cannot be applied directly to one-pass AUC optimization because AUC is measured by a sum of losses defined over pairs of instances from different classes. We develop a regression-based algorithm which only needs to maintain the first and second order statistics of training data in memory, resulting in a storage requirement independent of the number of training data. To efficiently handle high-dimensional data, we develop two deterministic algorithms that approximate the covariance matrices. We verify, both theoretically and empirically, the effectiveness of the proposed algorithms. (C) 2016 Elsevier B.V. All rights reserved.

引用

页码：1 / 29

页数：29

共 53 条

[31] Using AUC and accuracy in evaluating learning algorithms [J].

Huang, J ;

Ling, CX .

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2005, 17 (03) :299-310

[32]

Joachims T., 2006, P 12 ACM SIGKDD INT, P217, DOI [DOI 10.1145/1150402.1150429, 10.1145/1150402.1150429]

[33]

Kar P., 2013, P MACHINE LEARNING R, P441

[34]

Kotlowski W., 2011, ICML, P1113

[35]

Langford J, 2009, J MACH LEARN RES, V10, P777

[36] Simple and Deterministic Matrix Sketching [J].

Liberty, Edo .

19TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'13), 2013, :581-588

[37] Exploratory Undersampling for Class-Imbalance Learning [J].

Liu, Xu-Ying ;

Wu, Jianxin ;

Zhou, Zhi-Hua .

IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2009, 39 (02) :539-550

[38]

McDiarmid C., 1989, SURVEYS COMBINATORIC, P148, DOI DOI 10.1017/CBO9781107359949.008

[39]

Menon Aditya Krishna, 2014, P 27 ANN C LEARN THE, V35, P68

[40] BASIC PRINCIPLES OF ROC ANALYSIS [J].

METZ, CE .

SEMINARS IN NUCLEAR MEDICINE, 1978, 8 (04) :283-298

← 1 2 3 4 5 6 →