Blessing of Dimensionality: Recovering Mixture Data via Dictionary Pursuit

Citations: 64
Authors
Liu, Guangcan [1 ,2 ,3 ]
Liu, Qingshan [1 ]
Li, Ping [3 ]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Sch Informat & Control, B DAT, Nanjing 210044, Jiangsu, Peoples R China
[2] Nanjing Univ Informat Sci & Technol, Sch Informat & Control, CICAEET, Nanjing 210044, Jiangsu, Peoples R China
[3] Rutgers State Univ, Dept Stat & Biostat, Piscataway, NJ 08854 USA
Funding
National Science Foundation (USA)
Keywords
low-rank representation; incoherent condition; dictionary learning; matrix factorization; subspace clustering; ROBUST PCA; SUBSPACE; FACTORIZATION; SEGMENTATION;
DOI
10.1109/TPAMI.2016.2539946
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper studies the problem of recovering authentic samples that lie on a union of multiple subspaces from their corrupted observations. Because today's data are high-dimensional and massive, it is arguable that the target matrix (i.e., the authentic sample matrix) to recover is often low-rank. In this case, the recently established Robust Principal Component Analysis (RPCA) method already provides a convenient way to recover mixture data. In general, however, RPCA is not good enough, because the incoherent condition it assumes is not consistent with the mixture structure of multiple subspaces: as the number of subspaces grows, the row-coherence of the data keeps increasing and RPCA degrades accordingly. To overcome the challenges arising from mixture data, we propose using Low-Rank Representation (LRR). We show that LRR handles mixture data well, as long as its dictionary is configured appropriately. More precisely, we mathematically prove that LRR can weaken the dependence on the row-coherence, provided that the dictionary is well-conditioned and its rank is not too high. In particular, if the dictionary itself is sufficiently low-rank, the dependence on the row-coherence can be removed completely. These results provide some elementary principles for dictionary learning and naturally lead to a practical algorithm for recovering mixture data. Our experiments on randomly generated matrices and real motion sequences show promising results.
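The abstract's claim that row-coherence rises with the number of subspaces can be explored numerically. The sketch below is not taken from the paper. It assumes the commonly used definition of row-coherence for a rank-r matrix with n columns, namely (n/r) times the largest squared row norm of the matrix of right singular vectors, and it assumes a synthetic setup in which the total rank is held fixed while it is split across k random subspaces. The function names `row_coherence` and `sample_union_of_subspaces` and all sizes are hypothetical choices for illustration.

```python
import numpy as np


def row_coherence(X, rank):
    """Row-coherence (n / rank) * max_j ||V^T e_j||^2 (assumed definition)."""
    # Rows of vt are right singular vectors, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    v = vt[:rank].T                       # shape (n, rank), orthonormal columns
    leverage = np.sum(v ** 2, axis=1)     # squared row norms of V
    n = X.shape[1]
    return (n / rank) * leverage.max()


def sample_union_of_subspaces(m, n, rank, k, rng):
    """Draw n columns from k random subspaces of R^m with total rank `rank`.

    Assumption for this sketch: each subspace has dimension rank // k and
    contributes n // k samples, so the overall rank stays fixed as k varies.
    """
    assert rank % k == 0 and n % k == 0
    d, per_subspace = rank // k, n // k
    blocks = []
    for _ in range(k):
        basis = rng.standard_normal((m, d))              # random subspace basis
        coeffs = rng.standard_normal((d, per_subspace))  # random coefficients
        blocks.append(basis @ coeffs)
    return np.hstack(blocks)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, n, rank = 50, 200, 20
    for k in (1, 2, 4, 10, 20):
        X = sample_union_of_subspaces(m, n, rank, k, rng)
        print(f"subspaces={k:2d}  row-coherence={row_coherence(X, rank):.2f}")
```

Under this definition the value always lies between 1 and n/r. In this synthetic setup it tends to be much larger when the same rank is spread over many low-dimensional subspaces than when all samples come from a single subspace, which is the trend the abstract describes.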
Pages: 47-60
Page count: 14