Convergence of Alternating Gradient Descent for Matrix Factorization

Cited: 0
Authors: Ward, Rachel [1]; Kolda, Tamara G. [2]
Affiliations:
[1] Univ Texas Austin, Austin, TX 78712 USA
[2] MathSci Ai, Dublin, CA USA
DOI: not available
CLC classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
We consider alternating gradient descent (AGD) with fixed step size applied to the asymmetric matrix factorization objective. We show that, for a rank-$r$ matrix $A \in \mathbb{R}^{m \times n}$, $T = C\left(\sigma_1(A)/\sigma_r(A)\right)^2 \log(1/\epsilon)$ iterations of alternating gradient descent suffice to reach an $\epsilon$-optimal factorization $\|A - X_T Y_T^\top\|_F^2 \le \epsilon \|A\|_F^2$ with high probability starting from an atypical random initialization. The factors have rank $d \ge r$, so that $X_T \in \mathbb{R}^{m \times d}$ and $Y_T \in \mathbb{R}^{n \times d}$, and mild overparameterization suffices for the constant $C$ in the iteration complexity $T$ to be an absolute constant. Experiments suggest that our proposed initialization is not merely of theoretical benefit, but rather significantly improves the convergence rate of gradient descent in practice. Our proof is conceptually simple: a uniform Polyak-Łojasiewicz (PL) inequality and uniform Lipschitz smoothness constant are guaranteed for a sufficient number of iterations, starting from our random initialization. Our proof method should be useful for extending and simplifying convergence analyses for a broader class of nonconvex low-rank factorization problems.
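The AGD scheme the abstract describes can be sketched in a few lines of NumPy: alternate fixed-step gradient updates on $X$ and $Y$ for the objective $\frac{1}{2}\|A - XY^\top\|_F^2$. The sketch below uses a plain scaled Gaussian initialization for illustration only (it does not reproduce the paper's specific "atypical" initialization), and the step size, dimensions, and iteration count are illustrative assumptions, not the paper's choices.

```python
import numpy as np

def alternating_gradient_descent(A, d, step, T, seed=0):
    """Minimize 0.5 * ||A - X @ Y.T||_F^2 by alternating fixed-step
    gradient updates on X, then Y. Plain scaled Gaussian initialization
    is used here for illustration; the paper analyzes a specific
    ("atypical") random initialization."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    X = 0.1 * rng.standard_normal((m, d))
    Y = 0.1 * rng.standard_normal((n, d))
    for _ in range(T):
        R = X @ Y.T - A            # residual
        X = X - step * (R @ Y)     # gradient step in X (Y held fixed)
        R = X @ Y.T - A            # recompute residual with updated X
        Y = Y - step * (R.T @ X)   # gradient step in Y (X held fixed)
    return X, Y

# Toy example: an exactly rank-3 matrix, mildly overparameterized with d = 6.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 3)) @ rng.standard_normal((3, 20))
A = A / np.linalg.norm(A, 2)       # normalize so sigma_1(A) = 1
X, Y = alternating_gradient_descent(A, d=6, step=0.2, T=5000)
rel_err = np.linalg.norm(A - X @ Y.T) / np.linalg.norm(A)
```

Because the target is exactly rank 3 and $d \ge r$, the relative Frobenius error of the factorization drives toward zero; normalizing by the spectral norm keeps the fixed step size in a stable range.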
Pages: 14
Related papers (50 total)
  • [21] Convergence of a Fast Hierarchical Alternating Least Squares Algorithm for Nonnegative Matrix Factorization
    Hou, Liangshao
    Chu, Delin
    Liao, Li-Zhi
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (01) : 77 - 89
  • [22] ON THE CONVERGENCE OF DECENTRALIZED GRADIENT DESCENT
    Yuan, Kun
    Ling, Qing
    Yin, Wotao
    SIAM JOURNAL ON OPTIMIZATION, 2016, 26 (03) : 1835 - 1854
  • [23] Beyond Alternating Updates for Matrix Factorization with Inertial Bregman Proximal Gradient Algorithms
    Mukkamala, Mahesh Chandra
    Ochs, Peter
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [24] Coordinate projected gradient descent minimization and its application to orthogonal nonnegative matrix factorization
    Chorobura, Flavia
    Lupu, Daniela
    Necoara, Ion
    2022 IEEE 61ST CONFERENCE ON DECISION AND CONTROL (CDC), 2022, : 6929 - 6934
  • [25] A Non-Euclidean Gradient Descent Framework for Non-Convex Matrix Factorization
    Hsieh, Ya-Ping
    Kao, Yu-Chun
    Mahabadi, Rabeeh Karimi
    Yurtsever, Alp
    Kyrillidis, Anastasios
    Cevher, Volkan
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2018, 66 (22) : 5917 - 5926
  • [26] An Efficient Approach of GPU-accelerated Stochastic Gradient Descent Method for Matrix Factorization
    Li, Feng
    Ye, Yunming
    Li, Xutao
    JOURNAL OF INTERNET TECHNOLOGY, 2019, 20 (04): : 1087 - 1097
  • [27] A NEW APPROACH OF GPU-ACCELERATED STOCHASTIC GRADIENT DESCENT METHOD FOR MATRIX FACTORIZATION
    Li, Feng
    Ye, Yunming
    Li, Xutao
    Lu, Jiajie
    INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2019, 15 (02): : 697 - 711
  • [28] Gradient descent for deep matrix factorization: Dynamics and implicit bias towards low rank
    Chou, Hung-Hsu
    Gieshoff, Carsten
    Maly, Johannes
    Rauhut, Holger
    APPLIED AND COMPUTATIONAL HARMONIC ANALYSIS, 2024, 68
  • [29] Global Convergence of Non-Convex Gradient Descent for Computing Matrix Squareroot
    Jain, Prateek
    Jin, Chi
    Kakade, Sham M.
    Netrapalli, Praneeth
    ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 54, 2017, 54 : 479 - 488
  • [30] ON THE ASYMPTOTIC LINEAR CONVERGENCE OF GRADIENT DESCENT FOR NON-SYMMETRIC MATRIX COMPLETION
    Vu, Trung
    Raich, Raviv
    FIFTY-SEVENTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, IEEECONF, 2023, : 1049 - 1053