Scalable and robust sparse subspace clustering using randomized clustering and multilayer graphs

被引:26
作者
Abdolali, Maryam [1 ]
Gillis, Nicolas [2 ]
Rahmati, Mohammad [1 ]
机构
[1] Amirkabir Univ Technol, Dept Comp Engn & Informat Technol, Tehran, Iran
[2] Univ Mons, Fac Polytech, Dept Math & Operat Res, Rue Houdain 9, B-7000 Mons, Belgium
基金
欧洲研究理事会;
关键词
Sparse subspace clustering; Randomized clustering; Hierarchical clustering; Multilayer graph; Spectral clustering;
D O I
10.1016/j.sigpro.2019.05.017
中图分类号
TM [电工技术]; TN [电子技术、通信技术];
学科分类号
0808 ; 0809 ;
摘要
Sparse subspace clustering (SSC) is a state-of-the-art method for partitioning data points into the union of subspaces. However, it is not practical for large datasets as it requires solving a LASSO problem for each data point, where the number of variables in each LASSO problem is the number of data points. To improve the scalability of SSC, we propose to select a few sets of anchor points using a randomized hierarchical clustering method, and, for each set of anchor points, solve the LASSO problems for each data point allowing only anchor points to have a non-zero weight. This generates a multilayer graph where each layer corresponds to a set of anchor points. Using the Grassmann manifold of orthogonal matrices, the shared connectivity among the layers is summarized within a single subspace. Finally, we use k-means clustering within that subspace to cluster the data points, as done by SSC. We show on both synthetic and real-world datasets that the proposed method not only allows SSC to scale to large-scale datasets, but that it is also much more robust as it performs significantly better on noisy data and on data with close susbspaces and outliers, while it is not prone to oversegmentation. (C) 2019 Elsevier B.V. All rights reserved.
引用
收藏
页码:166 / 180
页数:15
相关论文
共 57 条
[1]  
[Anonymous], FOUND TRENDS MACH LE
[2]  
[Anonymous], 1987, CLUSTERING MEANS MED
[3]  
[Anonymous], TECH REP
[4]   Lambertian reflectance and linear subspaces [J].
Basri, R ;
Jacobs, DW .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2003, 25 (02) :218-233
[5]  
Boutsidis C, 2008, P 14 ACM SIGKDD INT
[6]  
Boutsidis C, 2009, PROCEEDINGS OF THE TWENTIETH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, P968
[7]   From Sparse Solutions of Systems of Equations to Sparse Modeling of Signals and Images [J].
Bruckstein, Alfred M. ;
Donoho, David L. ;
Elad, Michael .
SIAM REVIEW, 2009, 51 (01) :34-81
[8]   Invariant Scattering Convolution Networks [J].
Bruna, Joan ;
Mallat, Stephane .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2013, 35 (08) :1872-1886
[9]   Robust Principal Component Analysis? [J].
Candes, Emmanuel J. ;
Li, Xiaodong ;
Ma, Yi ;
Wright, John .
JOURNAL OF THE ACM, 2011, 58 (03)
[10]   RANK REVEALING QR FACTORIZATIONS [J].
CHAN, TF .
LINEAR ALGEBRA AND ITS APPLICATIONS, 1987, 88-9 :67-82