On the Computational and Statistical Complexity of Over-parameterized Matrix Sensing

Cited by: 0
Authors
Zhuo, Jiacheng [1 ]
Kwon, Jeongyeol [2 ]
Ho, Nhat [3 ]
Caramanis, Constantine [4 ]
Affiliations
[1] Univ Texas, Dept Comp Sci, Austin, TX 78712 USA
[2] Univ Wisconsin Madison, Wisconsin Inst Discovery, Madison, WI 53705 USA
[3] Univ Texas, Dept Stat & Data Sci, Austin, TX 78712 USA
[4] Univ Texas, Dept Elect & Comp Engn, Austin, TX 78712 USA
Keywords
Computational complexity; Statistical complexity; Over-parameterization; Matrix sensing; Matrix regression; Factorized gradient descent; RANK SOLUTIONS; FACTORIZATION; CONVERGENCE; ALGORITHM; RECOVERY;
DOI
Not available
CLC Classification
TP [Automation Technology; Computer Technology]
Subject Classification
0812
Abstract
We consider solving the low-rank matrix sensing problem with the Factorized Gradient Descent (FGD) method when the specified rank is larger than the true rank. We refer to this as over-parameterized matrix sensing. If the ground truth signal X* ∈ R^(d×d) is of rank r, but we try to recover it using FF^⊤ where F ∈ R^(d×k) and k > r, the existing statistical analysis either no longer holds or produces a vacuous statistical error upper bound (infinity) due to the flat local curvature of the loss function around the global minima. By decomposing the factorized matrix F into separate column spaces to capture the impact of using k > r, we show that ‖F_t F_t^⊤ − X*‖_F² converges sub-linearly to a statistical error of Õ(kdσ²/n) after Õ((σ_r/σ)·√(n/d)) iterations, where F_t is the output of FGD after t iterations, σ² is the variance of the observation noise, σ_r is the r-th largest eigenvalue of X*, and n is the number of samples. With a precise characterization of the convergence behavior and the statistical error, our results therefore offer a comprehensive picture of the statistical and computational complexity of solving the over-parameterized matrix sensing problem with FGD.
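The FGD iteration described in the abstract can be sketched as follows. This is a minimal hypothetical illustration, not the authors' code: the problem sizes, step size, random initialization, and Gaussian sensing model are our own assumptions. It sets up a rank-r ground truth X*, takes n noisy linear measurements, and runs gradient descent on the factor F with k > r.

```python
import numpy as np

# Minimal sketch (assumed setup, not the paper's code) of Factorized Gradient
# Descent for noisy matrix sensing in the over-parameterized regime (k > r).
rng = np.random.default_rng(0)
d, r, k, n = 20, 2, 5, 2000            # dimension, true rank, specified rank, samples
U = rng.normal(size=(d, r))
X_star = U @ U.T                       # rank-r PSD ground truth X*
A = rng.normal(size=(n, d, d))         # Gaussian sensing matrices A_i
sigma = 0.01                           # observation-noise standard deviation
y = np.einsum('nij,ij->n', A, X_star) + sigma * rng.normal(size=n)

F = 0.1 * rng.normal(size=(d, k))      # over-parameterized factor, small random init
eta = 0.002                            # illustrative step size
for _ in range(1000):
    residual = np.einsum('nij,ij->n', A, F @ F.T) - y
    grad_X = np.einsum('n,nij->ij', residual, A) / n   # gradient w.r.t. X = F F^T
    F -= eta * (grad_X + grad_X.T) @ F                 # chain rule through X = F F^T
err = np.linalg.norm(F @ F.T - X_star, 'fro') ** 2     # ||F_t F_t^T - X*||_F^2
rel_err = err / np.linalg.norm(X_star, 'fro') ** 2
```

Increasing n shrinks the error plateau, consistent with the Õ(kdσ²/n) statistical error in the abstract, while the extra k − r factor directions are what slow the convergence to a sub-linear rate.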
Pages: 1-47
Page count: 47