TESTING THE MANIFOLD HYPOTHESIS

被引:203
作者
Fefferman, Charles [1 ]
Mitter, Sanjoy [2 ]
Narayanan, Hariharan [3 ,4 ]
机构
[1] Princeton Univ, Dept Math, Princeton, NJ 08544 USA
[2] MIT, Lab Informat & Decis Syst, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[3] Univ Washington, Dept Stat, Seattle, WA 98195 USA
[4] Univ Washington, Dept Math, Seattle, WA 98195 USA
基金
美国国家科学基金会;
关键词
NONLINEAR DIMENSIONALITY REDUCTION; STATISTICAL VARIABLES; PRINCIPAL; EIGENMAPS; TOPOLOGY; COMPLEX; POINTS;
D O I
10.1090/jams/852
中图分类号
O1 [数学];
学科分类号
0701 ; 070101 ;
摘要
The hypothesis that high dimensional data tend to lie in the vicinity of a low dimensional manifold is the basis of manifold learning. The goal of this paper is to develop an algorithm (with accompanying complexity guarantees) for testing the existence of a manifold that fits a probability distribution supported in a separable Hilbert space, only using i.i.d. samples from that distribution. More precisely, our setting is the following. Suppose that data are drawn independently at random from a probability distribution P supported on the unit ball of a separable Hilbert space H. Let G(d.V.T)be the set of submanifolds of the unit ball of H whose volume is at most Vand reach (which is the supremum of all rsuch that any point at a distance less than rhas a unique nearest point on the manifold) is at least T. Let L(M.P) denote the mean-squared distance of a random point from the probability distribution Pto M. We obtain an algorithm that tests the manifold hypothesis in the following sense. The algorithm takes i.i.d. random samples from Pas input and determines which of the following two is true (at least one must be): There exists M ∈ G(d,CV,T/C) such that L(M,P)≤C∈ There exists no M ∈ G(d,V/C,CT) such that L(M,P) ≤ ∈/C The answer is correct with probability at least 1-δ. © 2016 American Mathmatical Society.
引用
收藏
页码:983 / 1049
页数:67
相关论文
共 56 条
  • [1] Amari Shun-ichi, 2000, TRANSLATIONS MATH MO, V191
  • [2] [Anonymous], 2004, KERNEL METHODS PATTE
  • [3] [Anonymous], 1984, C MODERN ANAL PROBAB
  • [4] Random Projections of Smooth Manifolds
    Baraniuk, Richard G.
    Wakin, Michael B.
    [J]. FOUNDATIONS OF COMPUTATIONAL MATHEMATICS, 2009, 9 (01) : 51 - 77
  • [5] Laplacian eigenmaps for dimensionality reduction and data representation
    Belkin, M
    Niyogi, P
    [J]. NEURAL COMPUTATION, 2003, 15 (06) : 1373 - 1396
  • [6] Manifold Reconstruction in Arbitrary Dimensions Using Witness Complexes
    Boissonnat, Jean-Daniel
    Guibas, Leonidas J.
    Oudot, Steve Y.
    [J]. DISCRETE & COMPUTATIONAL GEOMETRY, 2009, 42 (01) : 37 - 70
  • [7] Bousquet O, 2004, LECT NOTES ARTIF INT, V3176, P169
  • [8] k-plane clustering
    Bradley, PS
    Mangasarian, OL
    [J]. JOURNAL OF GLOBAL OPTIMIZATION, 2000, 16 (01) : 23 - 32
  • [9] Brand Matthew, 2003, Advances in Neural Information Processing Systems, P985
  • [10] Canas Guillermo., 2012, ADV NEURAL INFORM PR, V25, P2474