Exemplar-based low-rank matrix decomposition for data clustering

被引:5
作者
Wang, Lijun [1 ]
Dong, Ming [2 ]
机构
[1] Wayne State Univ, Dept Comp Sci, Detroit, MI 48201 USA
[2] Wayne State Univ, Dept Comp Sci, Detroit, MI 48201 USA
关键词
Clustering; Low-rank approximation; Subspaces; Matrix decomposition; MONTE-CARLO ALGORITHMS; NYSTROM METHOD; APPROXIMATIONS; FACTORIZATION; COMPUTATION;
D O I
10.1007/s10618-014-0347-0
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Today, digital data is accumulated at a faster than ever speed in science, engineering, biomedicine, and real-world sensing. The ubiquitous phenomenon of massive data and sparse information imposes considerable challenges in data mining research. In this paper, we propose a theoretical framework, Exemplar-based low-rank sparse matrix decomposition (EMD), to cluster large-scale datasets. Capitalizing on recent advances in matrix approximation and decomposition, EMD can partition datasets with large dimensions and scalable sizes efficiently. Specifically, given a data matrix, EMD first computes a representative data subspace and a near-optimal low-rank approximation. Then, the cluster centroids and indicators are obtained through matrix decomposition, in which we require that the cluster centroids lie within the representative data subspace. By selecting the representative exemplars, we obtain a compact "sketch"of the data. This makes the clustering highly efficient and robust to noise. In addition, the clustering results are sparse and easy for interpretation. From a theoretical perspective, we prove the correctness and convergence of the EMD algorithm, and provide detailed analysis on its efficiency, including running time and spatial requirements. Through extensive experiments performed on both synthetic and real datasets, we demonstrate the performance of EMD for clustering large-scale data.
引用
收藏
页码:324 / 357
页数:34
相关论文
共 50 条
  • [41] Multiscale Decomposition in Low-Rank Approximation
    Abdolali, Maryam
    Rahmati, Mohammad
    IEEE SIGNAL PROCESSING LETTERS, 2017, 24 (07) : 1015 - 1019
  • [42] Multi-view spectral clustering based on adaptive neighbor learning and low-rank tensor decomposition
    Xiao, Qingjiang
    Du, Shiqiang
    Liu, Baokai
    Yu, Yao
    Song, Jinmei
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (26) : 41159 - 41186
  • [43] Low-Rank Matrix Recovery With Simultaneous Presence of Outliers and Sparse Corruption
    Rahmani, Mostafa
    Atia, George K.
    IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2018, 12 (06) : 1170 - 1181
  • [44] Adaptive Graph Regularized Low-Rank Matrix Factorization With Noise and Outliers for Clustering
    Zhao, Min
    Liu, Jinglei
    IEEE ACCESS, 2020, 8 (08): : 171851 - 171863
  • [45] Low-Rank and Sparse Matrix Decomposition with Cluster Weighting for Hyperspectral Anomaly Detection
    Zhu, Lingxiao
    Wen, Gongjian
    Qiu, Shaohua
    REMOTE SENSING, 2018, 10 (05):
  • [46] LOW-RANK UPDATES OF MATRIX FUNCTIONS
    Beckermann, Bernhard
    Kressner, Daniel
    Schweitzer, Marcel
    SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS, 2018, 39 (01) : 539 - 565
  • [47] LERE: Learning-Based Low-Rank Matrix Recovery with Rank Estimation
    Xu, Zhengqin
    Zhang, Yulun
    Ma, Chao
    Yan, Yichao
    Peng, Zelin
    Xie, Shoulie
    Wu, Shiqian
    Yang, Xiaokang
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 14, 2024, : 16228 - 16236
  • [48] On Low-Rank Hankel Matrix Denoising
    Yin, Mingzhou
    Smith, Roy S.
    IFAC PAPERSONLINE, 2021, 54 (07): : 198 - 203
  • [49] Interference Mitigation for FMCW Radar With Sparse and Low-Rank Hankel Matrix Decomposition
    Wang, Jianping
    Ding, Min
    Yarovoy, Alexander
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2022, 70 : 822 - 834
  • [50] Sparse and Low-Rank Decomposition of a Hankel Structured Matrix for Impulse Noise Removal
    Jin, Kyong Hwan
    Ye, Jong Chul
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2018, 27 (03) : 1448 - 1461