A GEOMETRIC ANALYSIS OF SUBSPACE CLUSTERING WITH OUTLIERS

Cited by: 274
Authors
Soltanolkotabi, Mahdi [1]
Candes, Emmanuel J. [1]
Affiliations
[1] Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA
Funding
National Science Foundation (USA)
Keywords
Subspace clustering; spectral clustering; outlier detection; ℓ1 minimization; duality in linear programming; geometric functional analysis; properties of convex bodies; concentration of measure; motion segmentation; models
DOI
10.1214/12-AOS1034
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
This paper considers the problem of clustering a collection of unlabeled data points assumed to lie near a union of lower-dimensional planes. As is common in computer vision or unsupervised learning applications, we do not know in advance how many subspaces there are nor do we have any information about their dimensions. We develop a novel geometric analysis of an algorithm named sparse subspace clustering (SSC) [In IEEE Conference on Computer Vision and Pattern Recognition, 2009. CVPR 2009 (2009) 2790-2797. IEEE], which significantly broadens the range of problems where it is provably effective. For instance, we show that SSC can recover multiple subspaces, each of dimension comparable to the ambient dimension. We also prove that SSC can correctly cluster data points even when the subspaces of interest intersect. Further, we develop an extension of SSC that succeeds when the data set is corrupted with possibly overwhelmingly many outliers. Underlying our analysis are clear geometric insights, which may bear on other sparse recovery problems. A numerical study complements our theoretical analysis and demonstrates the effectiveness of these methods.
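For orientation, the ℓ1 minimization step on which SSC is usually described as resting can be sketched as follows. The notation is an assumption made here for illustration and does not come from this record: X is the matrix whose columns x_1, ..., x_N are the data points, and c is the coefficient vector sought for point x_i in the noiseless setting.

\[
\min_{c \in \mathbb{R}^{N}} \; \|c\|_{1}
\quad \text{subject to} \quad x_i = X c, \qquad c_i = 0 .
\]

Each point is thus expressed as a sparse combination of the other points. In the usual description of the method, the magnitudes of the recovered coefficients define a similarity graph to which spectral clustering is applied, which is consistent with the keywords listed above.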
Pages: 2195-2238
Number of pages: 44