DENSITY-SENSITIVE SEMISUPERVISED INFERENCE

Cited by: 14
Authors
Azizyan, Martin [1, 2]
Singh, Aarti [1, 2]
Wasserman, Larry [1, 2]
Affiliations
[1] Carnegie Mellon Univ, Dept Stat, Pittsburgh, PA 15213 USA
[2] Carnegie Mellon Univ, Machine Learning Dept, Pittsburgh, PA 15213 USA
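Keywords
Nonparametric inference; semisupervised; kernel density; efficiency
DOI
10.1214/13-AOS1092
Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract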
Semisupervised methods are techniques for using labeled data $(X_1, Y_1), \ldots, (X_n, Y_n)$ together with unlabeled data $X_{n+1}, \ldots, X_N$ to make predictions. These methods invoke some assumptions that link the marginal distribution $P_X$ of $X$ to the regression function $f(x)$. For example, it is common to assume that $f$ is very smooth over high density regions of $P_X$. Many of the methods are ad-hoc and have been shown to work in specific examples but are lacking a theoretical foundation. We provide a minimax framework for analyzing semisupervised methods. In particular, we study methods based on metrics that are sensitive to the distribution $P_X$. Our model includes a parameter $\alpha$ that controls the strength of the semisupervised assumption. We then use the data to adapt to $\alpha$.
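The idea of a metric that is sensitive to the marginal distribution can be illustrated with a small sketch. This is not the authors' estimator; it is a toy graph approximation under stated assumptions: a Gaussian kernel density estimate built from the pooled labeled and unlabeled points, an eps-neighborhood graph whose edge cost is the Euclidean length divided by the estimated density at the edge midpoint raised to the power alpha, and a Gaussian-kernel weighted average of the labels using the resulting shortest-path distances. All function names, the toy data, and the parameters `eps`, `h_kde`, and `h_reg` are hypothetical choices made for illustration.

```python
import heapq
import math


def euclid(p, q):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def kde(x, points, h):
    """Gaussian kernel density estimate at x from the pooled sample."""
    d = len(x)
    norm = len(points) * (math.sqrt(2.0 * math.pi) * h) ** d
    return sum(math.exp(-euclid(x, q) ** 2 / (2.0 * h * h)) for q in points) / norm


def density_distances(points, source, alpha, eps, h_kde):
    """Shortest-path distances from points[source] over an eps-neighborhood
    graph with edge cost = length / density(midpoint) ** alpha (assumed form)."""
    n = len(points)
    dist = [math.inf] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d_u, u = heapq.heappop(heap)
        if d_u > dist[u]:
            continue  # stale heap entry
        for v in range(n):
            length = euclid(points[u], points[v])
            if v == u or length > eps:
                continue
            mid = tuple((a + b) / 2.0 for a, b in zip(points[u], points[v]))
            cost = length / kde(mid, points, h_kde) ** alpha
            if d_u + cost < dist[v]:
                dist[v] = d_u + cost
                heapq.heappush(heap, (dist[v], v))
    return dist


def predict(points, labels, query, alpha, eps=0.7, h_kde=0.2, h_reg=1.0):
    """Kernel-weighted average of labels using density-sensitive distances.
    labels maps the index of each labeled point to its response."""
    dist = density_distances(points, query, alpha, eps, h_kde)
    weights = {i: math.exp(-0.5 * (dist[i] / h_reg) ** 2) for i in labels}
    total = sum(weights.values())
    if total == 0.0:
        return sum(labels.values()) / len(labels)  # fall back to the label mean
    return sum(weights[i] * labels[i] for i in labels) / total


# Toy data: two 1-D clusters separated by a low-density gap.
# Only one point per cluster is labeled; the rest are unlabeled.
points = [(0.1 * i,) for i in range(10)] + [(1.5 + 0.1 * i,) for i in range(10)]
labels = {0: 0.0, 10: 1.0}
query = 9  # x = 0.9, at the edge of the first cluster

pred_euclidean_like = predict(points, labels, query, alpha=0.0)
pred_density_sensitive = predict(points, labels, query, alpha=1.0)
```

With alpha = 0 the edge costs are plain lengths, so the query at x = 0.9 is closer to the labeled point in the other cluster and the prediction leans toward that label. With alpha = 1 the edge crossing the low-density gap becomes expensive, so the prediction leans toward the label in the query's own cluster. This mirrors the role the abstract assigns to alpha as a control on the strength of the semisupervised assumption.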
Pages: 751-771
Number of pages: 21