ROC curves and nonrandom data

Cited by: 23

Authors:

Cook, Jonathan Aaron [1]

Affiliations:

[1] Publ Co Accounting Oversight Board, 1666 K St NW, Washington, DC USA

Source:

PATTERN RECOGNITION LETTERS | 2017 / Vol. 85

Keywords:

ROC curves; Classifier evaluation; Sample-selection bias; PREDICT CLASSIFICATION PERFORMANCE; SAMPLE SELECTION; MODELS;

DOI:

10.1016/j.patrec.2016.11.015

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper shows that when a classifier is evaluated with nonrandom test data, ROC curves differ from the ROC curves that would be obtained with a random sample. To address this bias, this paper introduces a procedure for plotting ROC curves that are inferred from nonrandom test data. I provide simulations to illustrate the procedure as well as the magnitude of bias that is found in empirical ROC curves constructed with nonrandom test data. The paper also includes a demonstration of the procedure on (non-simulated) data used to model wine preferences in the wine industry. (C) 2016 Elsevier B.V. All rights reserved.
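The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's procedure and does not attempt its correction; it only compares the area under the empirical ROC curve on a random test sample with the area on a nonrandomly selected subset. The data-generating process, the selection rule, and the names `auc` and `simulate` are all assumptions made for illustration.

```python
import numpy as np


def auc(scores, labels):
    """Area under the empirical ROC curve via the Mann-Whitney rank statistic.

    Assumes there are no tied scores (true for continuous simulated scores).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)  # rank 1 = lowest score
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def simulate(n=20000, seed=0):
    """Return (AUC on the full random sample, AUC on a selected subset)."""
    rng = np.random.default_rng(seed)
    score = rng.normal(size=n)                # classifier score
    label = score + rng.normal(size=n) > 0    # true class, noisily tied to score
    # Hypothetical selection rule: a case enters the test data more often
    # when its score is high, so the test sample is not random.
    selected = score + 0.5 * rng.normal(size=n) > 0
    return auc(score, label), auc(score[selected], label[selected])


auc_random, auc_selected = simulate()
print(f"AUC, random sample:    {auc_random:.3f}")
print(f"AUC, nonrandom sample: {auc_selected:.3f}")
```

Under these assumptions the selected subset covers a narrower range of scores, so the classifier separates the classes less well within it and the empirical AUC falls below the random-sample value. Other selection rules can push the bias in the opposite direction.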

References

Pages: 35-41

Page count: 7

23 entries in total

[1] Bradley, AP. The use of the area under the ROC curve in the evaluation of machine learning algorithms [J]. PATTERN RECOGNITION, 1997, 30 (07): 1145-1159

[2] Breiman, L. Random forests [J]. MACHINE LEARNING, 2001, 45 (01): 5-32

[3] Cortez, Paulo; Cerdeira, Antonio; Almeida, Fernando; Matos, Telmo; Reis, Jose. Modeling wine preferences by data mining from physicochemical properties [J]. DECISION SUPPORT SYSTEMS, 2009, 47 (04): 547-553

[4] Crook, J; Banasik, J. Does reject inference really improve the performance of application scoring models? [J]. JOURNAL OF BANKING & FINANCE, 2004, 28 (04): 857-874

[5] Davis J., 2006, ICML 06, DOI 10.1145/1143844.1143874

[6] DORFMAN, DD; ALF, E. MAXIMUM-LIKELIHOOD ESTIMATION OF PARAMETERS OF SIGNAL-DETECTION THEORY AND DETERMINATION OF CONFIDENCE INTERVALS - RATING-METHOD DATA [J]. JOURNAL OF MATHEMATICAL PSYCHOLOGY, 1969, 6 (03): 487-&

[7] Erkanli, Alaattin; Sung, Minje; Costello, E. Jane; Angold, Adrian. Bayesian semi-parametric ROC analysis [J]. STATISTICS IN MEDICINE, 2006, 25 (22): 3905-3928

[8] Fawcett, T; Flach, PA. A response to Webb and Ting's On the application of ROC analysis to predict classification performance under varying class distributions [J]. MACHINE LEARNING, 2005, 58 (01): 33-38

[9] Fawcett, Tom. An introduction to ROC analysis [J]. PATTERN RECOGNITION LETTERS, 2006, 27 (08): 861-874

[10] He Haibo., 2011, SELF ADAPTIVE SYSTEM
