A framework for monitoring classifiers' performance: when and why failure occurs?

Cited by: 55
Authors
Cieslak, David A. [1 ]
Chawla, Nitesh V. [1 ]
Affiliations
[1] Univ Notre Dame, Dept Comp Sci, Notre Dame, IN 46556 USA
Keywords
Machine Learning; Friedman Test; Predictive Distribution; Class Prior; Sample Selection Bias
DOI
10.1007/s10115-008-0139-1
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Classifier error is the product of model bias and data variance. While it is important to understand the bias introduced by the choice of a learning algorithm, it is similarly important to understand the variability of the data over time, since even the One True Model might perform poorly when training and evaluation samples diverge. The ability to identify distributional divergence is therefore critical for pinpointing when fracture points in classifier performance will occur, particularly since contemporary evaluation methods such as tenfold cross-validation and hold-out are poor predictors under divergence. This article implements a comprehensive evaluation framework to proactively detect breakpoints in classifiers' predictions and shifts in data distributions through a series of statistical tests. We outline and utilize three scenarios under which data changes: sample selection bias, covariate shift, and shifting class priors. We evaluate the framework with a variety of classifiers and datasets.
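
The abstract does not name the individual statistical tests used by the framework; as a hedged illustration only, the sketch below flags covariate shift by running a per-feature two-sample Kolmogorov-Smirnov test between a training sample and an evaluation sample. The function name detect_covariate_shift, the significance threshold, and the synthetic data are assumptions for illustration, not the authors' implementation.

# Minimal sketch (assumed, not the paper's code): detect per-feature
# covariate shift between training and evaluation samples with a
# two-sample Kolmogorov-Smirnov test.
import numpy as np
from scipy.stats import ks_2samp

def detect_covariate_shift(X_train, X_eval, alpha=0.05):
    # Return indices of features whose train/eval marginal distributions
    # differ at significance level alpha.
    shifted = []
    for j in range(X_train.shape[1]):
        statistic, p_value = ks_2samp(X_train[:, j], X_eval[:, j])
        if p_value < alpha:
            shifted.append(j)
    return shifted

# Synthetic example: shift only the second feature of the evaluation sample.
rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(500, 3))
X_eval = X_train + np.array([0.0, 0.8, 0.0])
print(detect_covariate_shift(X_train, X_eval))  # prints [1]

A per-feature marginal test is only one possible detector; shifts in class priors or in the joint distribution would call for other tests (for example, comparing the distributions of predicted classes), which is the broader scope of the framework described above.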
Pages: 83-108
Number of pages: 26