Estimating the support of a high-dimensional distribution

Cited by: 3894
Authors
Schölkopf, B
Platt, JC
Shawe-Taylor, J
Smola, AJ
Williamson, RC
Affiliations
[1] Microsoft Res Ltd, Cambridge CB2 3NH, England
[2] Microsoft Res, Redmond, WA 98052 USA
[3] Univ London Royal Holloway & Bedford New Coll, Egham TW20 0EX, Surrey, England
[4] Australian Natl Univ, Dept Engn, Canberra, ACT 0200, Australia
DOI
10.1162/089976601750264965
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Suppose you are given some data set drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified value between 0 and 1. We propose a method to approach this problem by trying to estimate a function f that is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space. The expansion coefficients are found by solving a quadratic programming problem, which we do by carrying out sequential optimization over pairs of input patterns. We also provide a theoretical analysis of the statistical performance of our algorithm. The algorithm is a natural extension of the support vector algorithm to the case of unlabeled data.
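
The procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the dual form commonly given for this method: minimise 0.5 * sum_ij a_i a_j k(x_i, x_j) subject to 0 <= a_i <= 1/(nu*n) and sum_i a_i = 1, with a Gaussian kernel. The names `rbf`, `fit_one_class` and `decision`, the uniform starting point, the exhaustive sweep over all pairs, and the fallback used for the offset `rho` are illustrative choices and are not taken from the paper.

```python
import math
import random


def rbf(x, y, gamma=0.5):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def fit_one_class(X, nu=0.2, gamma=0.5, sweeps=50, tol=1e-8):
    """Pairwise optimisation of the assumed dual; returns (coefficients, rho)."""
    n = len(X)
    K = [[rbf(xi, xj, gamma) for xj in X] for xi in X]
    C = 1.0 / (nu * n)          # upper bound on every coefficient
    a = [1.0 / n] * n           # feasible start: sums to 1, each a_i <= C

    def out(i):
        # Current kernel expansion evaluated at training point i.
        return sum(a[k] * K[i][k] for k in range(n))

    for _ in range(sweeps):
        for i in range(n):
            for j in range(i + 1, n):
                eta = K[i][i] + K[j][j] - 2.0 * K[i][j]
                if eta < 1e-12:          # (near-)duplicate points: skip
                    continue
                total = a[i] + a[j]      # kept fixed so that sum(a) stays 1
                # Unconstrained minimiser along the pair direction ...
                new_j = a[j] + (out(i) - out(j)) / eta
                # ... clipped so both coefficients stay inside [0, C].
                lo, hi = max(0.0, total - C), min(C, total)
                new_j = min(max(new_j, lo), hi)
                a[j] = new_j
                a[i] = total - new_j

    # Offset: expansion value at points whose coefficient is strictly inside
    # the box. The fallback branch is a heuristic, used only if none exists.
    outputs = [out(i) for i in range(n)]
    free = [outputs[i] for i in range(n) if tol < a[i] < C - tol]
    if free:
        rho = sum(free) / len(free)
    else:
        rho = max(outputs[i] for i in range(n) if a[i] > tol)
    return a, rho


def decision(x, X, a, rho, gamma=0.5):
    """f(x): positive inside the estimated region S, negative outside."""
    return sum(ai * rbf(xi, x, gamma) for ai, xi in zip(a, X)) - rho


if __name__ == "__main__":
    rng = random.Random(0)
    X = [(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(20)]
    a, rho = fit_one_class(X, nu=0.2)
    print("non-zero coefficients:", sum(1 for ai in a if ai > 1e-8))
    print("f(far point):", decision((10.0, 10.0), X, a, rho))
```

Only training points with a non-zero coefficient contribute to `decision`, which is the sense in which the expansion uses a potentially small subset of the data. A practical solver would select pairs by how strongly they violate the optimality conditions instead of sweeping over all of them, and would cache the expansion values.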
Pages: 1443-1471
Number of pages: 29