Combining biomarkers linearly and nonlinearly for classification using the area under the ROC curve

Cited by: 17

Authors:

Fong, Youyi [1,2]

Yin, Shuxin [1]

Huang, Ying [1,2]

Affiliations:

[1] Fred Hutchinson Canc Res Ctr, Publ Hlth Sci, 1100 Fairview Ave N,M2-B500, Seattle, WA 98109 USA

[2] Univ Washington, Dept Biostat, Seattle, WA 98195 USA

Source:

STATISTICS IN MEDICINE | 2016 / Vol. 35 / Issue 21

Keywords:

AUC; biomarker combination; classification; kernel; ramp loss; ROC curve; support vector machine; variable selection; models

DOI:

10.1002/sim.6956

Chinese Library Classification (CLC) number:

Q [Biological Sciences];

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

In biomedical studies, it is often of interest to classify or predict a subject's disease status based on a variety of biomarker measurements. A commonly used classification criterion is the area under the receiver operating characteristic curve (AUC). Many methods have been proposed to optimize approximated empirical AUC criteria, but the existing methods have two limitations. First, most methods are designed only to find the best linear combination of biomarkers, which may not perform well when there is strong nonlinearity in the data. Second, many existing linear combination methods use gradient-based algorithms to find the best marker combination, which often results in suboptimal local solutions. In this paper, we address these two problems by proposing a new kernel-based AUC optimization method called ramp AUC (RAUC). This method approximates the empirical AUC loss function with a ramp function and finds the best combination by a difference of convex functions algorithm. We show that, as a linear combination method, RAUC leads to a consistent and asymptotically normal estimator of the linear marker combination when the data are generated from a semiparametric generalized linear model, just as the smoothed AUC method does. Through simulation studies and real data examples, we demonstrate that RAUC outperforms smoothed AUC in finding the best linear marker combinations and can successfully capture nonlinear patterns in the data to achieve better classification performance. We illustrate our method with a dataset from a recent HIV vaccine trial. Copyright (c) 2016 John Wiley & Sons, Ltd.
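
The idea of replacing the empirical AUC loss with a ramp function can be illustrated with a minimal Python sketch. It is not the authors' implementation: the ramp used here is the common form written as a difference of two hinge functions, which is an assumption, and the paper's exact parametrization may differ. The function names (`empirical_auc`, `ramp`, `ramp_auc_loss`) and the simulated data are hypothetical, and only a linear marker combination is shown, with no kernel and no optimization step.

```python
import numpy as np


def empirical_auc(scores_case, scores_control):
    """Empirical AUC: share of case/control pairs ranked correctly (ties count 1/2)."""
    diff = scores_case[:, None] - scores_control[None, :]
    return np.mean((diff > 0) + 0.5 * (diff == 0))


def hinge(u):
    """Convex hinge function max(0, u)."""
    return np.maximum(0.0, u)


def ramp(u):
    """Assumed ramp: 1 for u <= 0, 1 - u on (0, 1), 0 for u >= 1.

    Written as a difference of two convex hinge functions, which is the
    structure a difference of convex functions algorithm can exploit.
    """
    return hinge(1.0 - u) - hinge(-u)


def ramp_auc_loss(beta, x_case, x_control):
    """Ramp surrogate for 1 - AUC of the linear score x @ beta."""
    diff = (x_case @ beta)[:, None] - (x_control @ beta)[None, :]
    return np.mean(ramp(diff))


# Hypothetical simulated markers: cases shifted upward relative to controls.
rng = np.random.default_rng(0)
x_case = rng.normal(1.0, 1.0, size=(30, 2))
x_control = rng.normal(0.0, 1.0, size=(40, 2))
beta = np.array([1.0, 0.5])

print("empirical AUC:", empirical_auc(x_case @ beta, x_control @ beta))
print("ramp loss    :", ramp_auc_loss(beta, x_case, x_control))
```

Under this assumed form, the ramp is never smaller than the 0-1 pairwise loss, so the ramp loss is an upper bound on one minus the empirical AUC, while staying bounded by 1 for badly misranked pairs.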

Pages: 3792-3809

Page count: 18
