Convergence of Alternating Gradient Descent for Matrix Factorization

Cited: 0
Authors: Ward, Rachel [1]; Kolda, Tamara G. [2]
Affiliations:
[1] Univ Texas Austin, Austin, TX 78712 USA
[2] MathSci Ai, Dublin, CA USA
DOI: not available
CLC classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
We consider alternating gradient descent (AGD) with fixed step size applied to the asymmetric matrix factorization objective. We show that, for a rank-$r$ matrix $A \in \mathbb{R}^{m \times n}$, $T = C\left(\sigma_1(A)/\sigma_r(A)\right)^2 \log(1/\epsilon)$ iterations of alternating gradient descent suffice to reach an $\epsilon$-optimal factorization $\|A - X_T Y_T^\top\|_F^2 \le \epsilon \|A\|_F^2$ with high probability starting from an atypical random initialization. The factors have rank $d \ge r$, so that $X_T \in \mathbb{R}^{m \times d}$ and $Y_T \in \mathbb{R}^{n \times d}$, and mild overparameterization suffices for the constant $C$ in the iteration complexity $T$ to be an absolute constant. Experiments suggest that our proposed initialization is not merely of theoretical benefit, but rather significantly improves the convergence rate of gradient descent in practice. Our proof is conceptually simple: a uniform Polyak-Łojasiewicz (PL) inequality and uniform Lipschitz smoothness constant are guaranteed for a sufficient number of iterations, starting from our random initialization. Our proof method should be useful for extending and simplifying convergence analyses for a broader class of nonconvex low-rank factorization problems.
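The AGD scheme the abstract describes can be sketched in a few lines of NumPy: alternate fixed-step gradient updates on $X$ and $Y$ for the objective $\frac{1}{2}\|A - XY^\top\|_F^2$. The sketch below uses a plain scaled Gaussian initialization for illustration only (it does not reproduce the paper's specific "atypical" initialization), and the step size, dimensions, and iteration count are illustrative assumptions, not the paper's choices.

```python
import numpy as np

def alternating_gradient_descent(A, d, step, T, seed=0):
    """Minimize 0.5 * ||A - X @ Y.T||_F^2 by alternating fixed-step
    gradient updates on X, then Y. Plain scaled Gaussian initialization
    is used here for illustration; the paper analyzes a specific
    ("atypical") random initialization."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    X = 0.1 * rng.standard_normal((m, d))
    Y = 0.1 * rng.standard_normal((n, d))
    for _ in range(T):
        R = X @ Y.T - A            # residual
        X = X - step * (R @ Y)     # gradient step in X (Y held fixed)
        R = X @ Y.T - A            # recompute residual with updated X
        Y = Y - step * (R.T @ X)   # gradient step in Y (X held fixed)
    return X, Y

# Toy example: an exactly rank-3 matrix, mildly overparameterized with d = 6.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 3)) @ rng.standard_normal((3, 20))
A = A / np.linalg.norm(A, 2)       # normalize so sigma_1(A) = 1
X, Y = alternating_gradient_descent(A, d=6, step=0.2, T=5000)
rel_err = np.linalg.norm(A - X @ Y.T) / np.linalg.norm(A)
```

Because the target is exactly rank 3 and $d \ge r$, the relative Frobenius error of the factorization drives toward zero; normalizing by the spectral norm keeps the fixed step size in a stable range.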
Pages: 14
Related papers (50 total)
  • [21] Convergence of a Fast Hierarchical Alternating Least Squares Algorithm for Nonnegative Matrix Factorization
    Hou, Liangshao
    Chu, Delin
    Liao, Li-Zhi
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (01) : 77 - 89
  • [22] ON THE CONVERGENCE OF DECENTRALIZED GRADIENT DESCENT
    Yuan, Kun
    Ling, Qing
    Yin, Wotao
    SIAM JOURNAL ON OPTIMIZATION, 2016, 26 (03) : 1835 - 1854
  • [23] Beyond Alternating Updates for Matrix Factorization with Inertial Bregman Proximal Gradient Algorithms
    Mukkamala, Mahesh Chandra
    Ochs, Peter
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [24] Coordinate projected gradient descent minimization and its application to orthogonal nonnegative matrix factorization
    Chorobura, Flavia
    Lupu, Daniela
    Necoara, Ion
    2022 IEEE 61ST CONFERENCE ON DECISION AND CONTROL (CDC), 2022, : 6929 - 6934
  • [25] A Non-Euclidean Gradient Descent Framework for Non-Convex Matrix Factorization
    Hsieh, Ya-Ping
    Kao, Yu-Chun
    Mahabadi, Rabeeh Karimi
    Yurtsever, Alp
    Kyrillidis, Anastasios
    Cevher, Volkan
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2018, 66 (22) : 5917 - 5926
  • [26] An Efficient Approach of GPU-accelerated Stochastic Gradient Descent Method for Matrix Factorization
    Li, Feng
    Ye, Yunming
    Li, Xutao
    JOURNAL OF INTERNET TECHNOLOGY, 2019, 20 (04): : 1087 - 1097
  • [27] A NEW APPROACH OF GPU-ACCELERATED STOCHASTIC GRADIENT DESCENT METHOD FOR MATRIX FACTORIZATION
    Li, Feng
    Ye, Yunming
    Li, Xutao
    Lu, Jiajie
    INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2019, 15 (02): : 697 - 711
  • [28] Gradient descent for deep matrix factorization: Dynamics and implicit bias towards low rank
    Chou, Hung-Hsu
    Gieshoff, Carsten
    Maly, Johannes
    Rauhut, Holger
    APPLIED AND COMPUTATIONAL HARMONIC ANALYSIS, 2024, 68
  • [29] Global Convergence of Non-Convex Gradient Descent for Computing Matrix Squareroot
    Jain, Prateek
    Jin, Chi
    Kakade, Sham M.
    Netrapalli, Praneeth
    ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 54, 2017, 54 : 479 - 488
  • [30] ON THE ASYMPTOTIC LINEAR CONVERGENCE OF GRADIENT DESCENT FOR NON-SYMMETRIC MATRIX COMPLETION
    Vu, Trung
    Raich, Raviv
    FIFTY-SEVENTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, IEEECONF, 2023, : 1049 - 1053