Convergence of Alternating Gradient Descent for Matrix Factorization

Cited: 0
Authors
Ward, Rachel [1]
Kolda, Tamara G. [2]
Affiliations
[1] Univ Texas Austin, Austin, TX 78712 USA
[2] MathSci.ai, Dublin, CA USA
Keywords
(none listed)
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We consider alternating gradient descent (AGD) with fixed step size applied to the asymmetric matrix factorization objective. We show that, for a rank-$r$ matrix $A \in \mathbb{R}^{m \times n}$, $T = C\,(\sigma_1(A)/\sigma_r(A))^2 \log(1/\epsilon)$ iterations of alternating gradient descent suffice to reach an $\epsilon$-optimal factorization $\|A - X_T Y_T^\top\|_F^2 \le \epsilon \|A\|_F^2$ with high probability starting from an atypical random initialization. The factors have rank $d \ge r$ so that $X_T \in \mathbb{R}^{m \times d}$ and $Y_T \in \mathbb{R}^{n \times d}$, and mild overparameterization suffices for the constant $C$ in the iteration complexity $T$ to be an absolute constant. Experiments suggest that our proposed initialization is not merely of theoretical benefit, but rather significantly improves the convergence rate of gradient descent in practice. Our proof is conceptually simple: a uniform Polyak-Łojasiewicz (PL) inequality and uniform Lipschitz smoothness constant are guaranteed for a sufficient number of iterations, starting from our random initialization. Our proof method should be useful for extending and simplifying convergence analyses for a broader class of nonconvex low-rank factorization problems.
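For concreteness, below is a minimal NumPy sketch of the fixed-step alternating scheme the abstract analyzes, applied to $f(X, Y) = \tfrac{1}{2}\|A - XY^\top\|_F^2$. The gradients $\nabla_X f = (XY^\top - A)Y$ and $\nabla_Y f = (XY^\top - A)^\top X$ follow directly from the objective; the scaled-Gaussian initialization and the $0.5/\sigma_1(A)$ step-size heuristic here are generic placeholders, not the paper's specific atypical random initialization or tuned constants.

```python
import numpy as np

def agd_factorize(A, d, step=None, iters=2000, seed=0):
    """Fixed-step alternating gradient descent on
    f(X, Y) = 0.5 * ||A - X @ Y.T||_F**2.

    Sketch only: the initialization and step size below are generic
    placeholders, not the paper's atypical random initialization.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    if step is None:
        # Heuristic fixed step on the order of 1/sigma_1(A): for roughly
        # balanced factors, each subproblem's smoothness constant scales
        # like sigma_1(A).
        step = 0.5 / np.linalg.norm(A, 2)
    X = rng.standard_normal((m, d)) / np.sqrt(d)  # placeholder init
    Y = rng.standard_normal((n, d)) / np.sqrt(d)  # placeholder init
    for _ in range(iters):
        X -= step * (X @ Y.T - A) @ Y      # gradient step in X with Y frozen
        Y -= step * (X @ Y.T - A).T @ X    # gradient step in Y using the new X
    return X, Y

# Toy usage: a rank-5 target, factored with mild overparameterization d > r.
rng = np.random.default_rng(1)
A = rng.standard_normal((60, 5)) @ rng.standard_normal((5, 40))
X, Y = agd_factorize(A, d=10)
rel_err = np.linalg.norm(A - X @ Y.T) ** 2 / np.linalg.norm(A) ** 2
```

In line with the bound above, one would expect `rel_err` to fall below a target $\epsilon$ after on the order of $C\,(\sigma_1(A)/\sigma_r(A))^2 \log(1/\epsilon)$ sweeps, with $C$ an absolute constant when $d \ge r$ with mild overparameterization.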
Pages: 14
Related Papers
50 entries in total; the first 10 are listed below.
  • [1] Global Convergence of Gradient Descent for Asymmetric Low-Rank Matrix Factorization
    Ye, Tian
    Du, Simon S.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [2] Provable Alternating Gradient Descent for Non-negative Matrix Factorization with Strong Correlations
    Li, Yuanzhi
    Liang, Yingyu
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [3] An alternating projected gradient algorithm for nonnegative matrix factorization
    Lin, Lu
    Liu, Zhong-Yun
    APPLIED MATHEMATICS AND COMPUTATION, 2011, 217 (24): 9997-10002
  • [4] Accelerating Stochastic Gradient Descent Based Matrix Factorization on FPGA
    Zhou, Shijie
    Kannan, Rajgopal
    Prasanna, Viktor K.
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2020, 31 (08): 1897-1911
  • [5] Nonnegative Matrix Factorization Using Autoencoders and Exponentiated Gradient Descent
    El Khatib, Alaa
    Huang, Shimeng
    Ghodsi, Ali
    Karray, Fakhri
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018
  • [6] Preconditioning Matters: Fast Global Convergence of Non-convex Matrix Factorization via Scaled Gradient Descent
    Jia, Xixi
    Wang, Hailin
    Peng, Jiangjun
    Feng, Xiangchu
    Meng, Deyu
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [7] Preconditioned Gradient Descent for Over-Parameterized Nonconvex Matrix Factorization
    Zhang, Gavin
    Fattahi, Salar
    Zhang, Richard Y.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [8] Accelerated Factored Gradient Descent for Low-Rank Matrix Factorization
    Zhou, Dongruo
    Cao, Yuan
    Gu, Quanquan
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108: 4430-4439
  • [9] Parallelizing Stochastic Gradient Descent with Hardware Transactional Memory for Matrix Factorization
    Wu, Zhenwei
    Luo, Yingqi
    Lu, Kai
    Wang, Xiaoping
    2018 3RD INTERNATIONAL CONFERENCE ON INFORMATION SYSTEMS ENGINEERING (ICISE), 2018: 118-121
  • [10] Efficient Parallel Stochastic Gradient Descent for Matrix Factorization Using GPU
    Nassar, Mohamed A.
    El-Sayed, Layla A. A.
    Taha, Yousry
    2016 11TH INTERNATIONAL CONFERENCE FOR INTERNET TECHNOLOGY AND SECURED TRANSACTIONS (ICITST), 2016: 63-68