Finding Low-Rank Solutions via Nonconvex Matrix Factorization, Efficiently and Provably

被引:38
作者
Park, Dohyung [1 ]
Kyrillidis, Anastasios [2 ,3 ]
Caramanis, Constantine [4 ]
Sanghavi, Sujay [4 ]
机构
[1] Facebook, Seattle, WA 98109 USA
[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10548 USA
[3] Rice Univ, Houston, TX 77005 USA
[4] Univ Texas Austin, Dept Elect & Comp Engn, Austin, TX 78712 USA
关键词
nonconvex optimization; matrix factorization; low-rank minimization; MONTE-CARLO ALGORITHMS; LOCAL MINIMA; OPTIMIZATION; COMPLETION; RECOVERY; DECOMPOSITION; CONE;
D O I
10.1137/17M1150189
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
A rank-r matrix X is an element of R-mxn can be written as a product UVT, where U is an element of R(mxr )and V is an element of R-nxr. One could exploit this observation in optimization: e.g., consider the minimization of a convex function f(X) over rank-r matrices, where the set of low-rank matrices is modeled via UVT . Though such parameterization reduces the number of variables and is more computationally efficient (of particular interest is the case r << min{m, n}), it comes at a cost: f(UVT) becomes a nonconvex function w.r.t. U and V. We study such parameterization on generic convex objectives f and focus on firstorder, gradient descent algorithms. We propose the bifactored gradient descent (BFGD) algorithm, an efficient first-order method that operates directly on the U, V factors. We show that when f is (restricted) smooth, BFGD has local sublinear convergence; when f is both (restricted) smooth and (restricted) strongly convex, it has local linear convergence. For several applications, we provide simple and efficient initialization schemes that provide initial conditions, good enough for the above convergence results to hold, globally. Extensive experimental results support our arguments that BFGD is an efficient and accurate nonconvex method, compared to state-of-the-art approaches.
引用
收藏
页码:2165 / 2204
页数:40
相关论文
共 109 条
[1]   The learnability of quantum states [J].
Aaronson, Scott .
PROCEEDINGS OF THE ROYAL SOCIETY A-MATHEMATICAL PHYSICAL AND ENGINEERING SCIENCES, 2007, 463 (2088) :3089-3114
[2]   SINGULAR VALUE DECOMPOSITION (SVD) IMAGE-CODING [J].
ANDREWS, HC ;
PATTERSON, CL .
IEEE TRANSACTIONS ON COMMUNICATIONS, 1976, 24 (04) :425-432
[3]  
[Anonymous], 2016, COMMUNICATION
[4]  
[Anonymous], 2001, NIPS
[5]  
[Anonymous], THESIS
[6]  
[Anonymous], PREPRINT
[7]  
[Anonymous], 2007, Advances in Neural Information Processing Systems
[8]  
[Anonymous], 2007, P KDD CUP WORKSHOP
[9]  
[Anonymous], 2016, C LEARNING THEORY
[10]  
[Anonymous], 2015, PREPRINT