lp\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${l_p}$$\end{document}-Recovery of the Most Significant Subspace Among Multiple Subspaces with Outliers

被引:0
作者
Gilad Lerman
Teng Zhang
机构
[1] University of Minnesota,Department of Mathematics
关键词
Best approximating subspace; minimization; Robust statistics; Optimization on the Grassmannian; Geometric probability; Hybrid linear modeling; Primary 68Q32; 62G35; 60D05; Secondary 62-07; 68T10;
D O I
10.1007/s00365-014-9242-6
中图分类号
学科分类号
摘要
We assume data sampled from a mixture of d\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$d$$\end{document}-dimensional linear subspaces with spherically symmetric distributions within each subspace and an additional outlier component with spherically symmetric distribution within the ambient space (for simplicity, we may assume that all distributions are uniform on their corresponding unit spheres). We also assume mixture weights for the different components. We say that one of the underlying subspaces of the model is most significant if its mixture weight is higher than the sum of the mixture weights of all other subspaces. We study the recovery of the most significant subspace by minimizing the lp\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$l_p$$\end{document}-averaged distances of data points from d\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$d$$\end{document}-dimensional subspaces of RD\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathbb R^D$$\end{document}, where 0<p∈R\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$0 < p \in \mathbb R$$\end{document}. Unlike other lp\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$l_p$$\end{document} minimization problems, this minimization is nonconvex for all p>0\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p>0$$\end{document} and thus requires different methods for its analysis. We show that if 0<p≤1\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$0<p \le 1$$\end{document}, then for any fraction of outliers, the most significant subspace can be recovered by lp\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$l_p$$\end{document} minimization with overwhelming probability (which depends on the generating distribution and its parameters). We show that when adding small noise around the underlying subspaces, the most significant subspace can be nearly recovered by lp\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$l_p$$\end{document} minimization for any 0<p≤1\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$0<p \le 1$$\end{document} with an error proportional to the noise level. On the other hand, if p>1\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p>1$$\end{document} and there is more than one underlying subspace, then with overwhelming probability the most significant subspace cannot be recovered or nearly recovered. This last result does not require spherically symmetric outliers.
引用
收藏
页码:329 / 385
页数:56
相关论文
共 54 条
  • [1] Arias-Castro E(2005)Connect the dots: how many random points can a regular curve pass through? Adv. Appl. Probab. 37 571-603
  • [2] Donoho DL(2011)Spectral clustering based on local linear approximations Electron. J. Stat. 5 1537-1587
  • [3] Huo X(1993)Orthogonal linear regression algorithm based on augmented matrix formulation Comput. Oper. Res. 20 829-836
  • [4] Tovey CA(2011)Robust principal component analysis? J. ACM 58 11-509
  • [5] Arias-Castro E(2006)Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information IEEE Trans. Inf. Theory 52 489-1223
  • [6] Chen G(2006)Stable signal recovery from incomplete and inaccurate measurements Commun. Pure Appl. Math. 59 1207-145
  • [7] Lerman G(1991)Singular integrals and rectifiable sets in Astérisque 193 1-253
  • [8] Bargiela A(1935): au-delà des graphes Lipschitziens Nature 135 917-934
  • [9] Hartley JK(1987)The minimum in the gamma function Comput. Stat. Data Anal. 5 239-829
  • [10] Candès EJ(2006)An introduction to Commun. Pure Appl. Math. 59 907-395