Convergence of Alternating Gradient Descent for Matrix Factorization

Cited: 0
Authors
Ward, Rachel [1]
Kolda, Tamara G. [2]
Affiliations
[1] Univ Texas Austin, Austin, TX 78712 USA
[2] MathSci.ai, Dublin, CA USA
Keywords
(none listed)
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We consider alternating gradient descent (AGD) with fixed step size applied to the asymmetric matrix factorization objective. We show that, for a rank-$r$ matrix $A \in \mathbb{R}^{m \times n}$, $T = C\,(\sigma_1(A)/\sigma_r(A))^2 \log(1/\epsilon)$ iterations of alternating gradient descent suffice to reach an $\epsilon$-optimal factorization $\|A - X_T Y_T^\top\|_F^2 \le \epsilon \|A\|_F^2$ with high probability starting from an atypical random initialization. The factors have rank $d \ge r$ so that $X_T \in \mathbb{R}^{m \times d}$ and $Y_T \in \mathbb{R}^{n \times d}$, and mild overparameterization suffices for the constant $C$ in the iteration complexity $T$ to be an absolute constant. Experiments suggest that our proposed initialization is not merely of theoretical benefit, but rather significantly improves the convergence rate of gradient descent in practice. Our proof is conceptually simple: a uniform Polyak-Łojasiewicz (PL) inequality and uniform Lipschitz smoothness constant are guaranteed for a sufficient number of iterations, starting from our random initialization. Our proof method should be useful for extending and simplifying convergence analyses for a broader class of nonconvex low-rank factorization problems.
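For concreteness, below is a minimal NumPy sketch of the fixed-step alternating scheme the abstract analyzes, applied to $f(X, Y) = \tfrac{1}{2}\|A - XY^\top\|_F^2$. The gradients $\nabla_X f = (XY^\top - A)Y$ and $\nabla_Y f = (XY^\top - A)^\top X$ follow directly from the objective; the scaled-Gaussian initialization and the $0.5/\sigma_1(A)$ step-size heuristic here are generic placeholders, not the paper's specific atypical random initialization or tuned constants.

```python
import numpy as np

def agd_factorize(A, d, step=None, iters=2000, seed=0):
    """Fixed-step alternating gradient descent on
    f(X, Y) = 0.5 * ||A - X @ Y.T||_F**2.

    Sketch only: the initialization and step size below are generic
    placeholders, not the paper's atypical random initialization.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    if step is None:
        # Heuristic fixed step on the order of 1/sigma_1(A): for roughly
        # balanced factors, each subproblem's smoothness constant scales
        # like sigma_1(A).
        step = 0.5 / np.linalg.norm(A, 2)
    X = rng.standard_normal((m, d)) / np.sqrt(d)  # placeholder init
    Y = rng.standard_normal((n, d)) / np.sqrt(d)  # placeholder init
    for _ in range(iters):
        X -= step * (X @ Y.T - A) @ Y      # gradient step in X with Y frozen
        Y -= step * (X @ Y.T - A).T @ X    # gradient step in Y using the new X
    return X, Y

# Toy usage: a rank-5 target, factored with mild overparameterization d > r.
rng = np.random.default_rng(1)
A = rng.standard_normal((60, 5)) @ rng.standard_normal((5, 40))
X, Y = agd_factorize(A, d=10)
rel_err = np.linalg.norm(A - X @ Y.T) ** 2 / np.linalg.norm(A) ** 2
```

In line with the bound above, one would expect `rel_err` to fall below a target $\epsilon$ after on the order of $C\,(\sigma_1(A)/\sigma_r(A))^2 \log(1/\epsilon)$ sweeps, with $C$ an absolute constant when $d \ge r$ with mild overparameterization.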
Pages: 14
Related Papers
50 entries in total; the first 10 are listed below.
  • [1] Global Convergence of Gradient Descent for Asymmetric Low-Rank Matrix Factorization
    Ye, Tian
    Du, Simon S.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [2] Provable Alternating Gradient Descent for Non-negative Matrix Factorization with Strong Correlations
    Li, Yuanzhi
    Liang, Yingyu
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [3] An alternating projected gradient algorithm for nonnegative matrix factorization
    Lin, Lu
    Liu, Zhong-Yun
    APPLIED MATHEMATICS AND COMPUTATION, 2011, 217 (24): 9997-10002
  • [4] Accelerating Stochastic Gradient Descent Based Matrix Factorization on FPGA
    Zhou, Shijie
    Kannan, Rajgopal
    Prasanna, Viktor K.
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2020, 31 (08): 1897-1911
  • [5] Nonnegative Matrix Factorization Using Autoencoders and Exponentiated Gradient Descent
    El Khatib, Alaa
    Huang, Shimeng
    Ghodsi, Ali
    Karray, Fakhri
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018
  • [6] Preconditioning Matters: Fast Global Convergence of Non-convex Matrix Factorization via Scaled Gradient Descent
    Jia, Xixi
    Wang, Hailin
    Peng, Jiangjun
    Feng, Xiangchu
    Meng, Deyu
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [7] Preconditioned Gradient Descent for Over-Parameterized Nonconvex Matrix Factorization
    Zhang, Gavin
    Fattahi, Salar
    Zhang, Richard Y.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [8] Accelerated Factored Gradient Descent for Low-Rank Matrix Factorization
    Zhou, Dongruo
    Cao, Yuan
    Gu, Quanquan
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108: 4430-4439
  • [9] Parallelizing Stochastic Gradient Descent with Hardware Transactional Memory for Matrix Factorization
    Wu, Zhenwei
    Luo, Yingqi
    Lu, Kai
    Wang, Xiaoping
    2018 3RD INTERNATIONAL CONFERENCE ON INFORMATION SYSTEMS ENGINEERING (ICISE), 2018: 118-121
  • [10] Efficient Parallel Stochastic Gradient Descent for Matrix Factorization Using GPU
    Nassar, Mohamed A.
    El-Sayed, Layla A. A.
    Taha, Yousry
    2016 11TH INTERNATIONAL CONFERENCE FOR INTERNET TECHNOLOGY AND SECURED TRANSACTIONS (ICITST), 2016: 63-68