Block-Randomized Stochastic Proximal Gradient for Low-Rank Tensor Factorization

Cited by: 32
Authors
Fu, Xiao [1]
Ibrahim, Shahana [1]
Wai, Hoi-To [2]
Gao, Cheng [1,3]
Huang, Kejun [4]
Affiliations
[1] Oregon State Univ, Sch Elect Engn & Comp Sci, Corvallis, OR 97331 USA
[2] Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Shatin, Hong Kong, Peoples R China
[3] Univ Missouri, Columbia, MO 65211 USA
[4] Univ Florida, Dept Comp & Informat Sci & Engn, Gainesville, FL 32611 USA
Funding
U.S. National Science Foundation;
Keywords
Large-scale tensor decomposition; canonical polyadic decomposition; stochastic gradient; Adagrad; LEAST-SQUARES; DECOMPOSITIONS; OPTIMIZATION; IDENTIFIABILITY; CONVERGENCE; METHODOLOGY; ALGORITHMS; MIXTURES;
DOI
10.1109/TSP.2020.2982321
CLC Classification
TM [Electrical Technology]; TN [Electronic Technology & Communication Technology];
Discipline Codes
0808; 0809
Abstract
This work considers the problem of computing the canonical polyadic decomposition (CPD) of large tensors. Prior works leverage data sparsity to handle this problem, an approach that is not suitable for the dense tensors that often arise in applications such as medical imaging, computer vision, and remote sensing. Stochastic optimization is known for its low memory cost and low per-iteration complexity when handling dense data. However, existing stochastic CPD algorithms are not flexible enough to incorporate the variety of constraints and regularization terms that are of interest in signal and data analytics, and the convergence properties of many such algorithms are unclear. In this work, we propose a stochastic optimization framework for large-scale CPD with constraints and regularization terms. The framework operates in a doubly randomized fashion and can be regarded as a judicious combination of randomized block coordinate descent (BCD) and stochastic proximal gradient (SPG). The algorithm enjoys lightweight updates and a small memory footprint, and it entails considerable flexibility: many frequently used regularizers and constraints can be readily handled. The approach is supported by convergence analysis. Numerical results on large-scale dense tensors are presented to showcase the effectiveness of the proposed approach.
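As a rough illustration of the doubly randomized scheme described above, the following NumPy sketch alternates a uniformly random mode (block) pick with a fiber-sampled stochastic proximal gradient step; nonnegativity serves as the example constraint, so the proximal step reduces to a projection. The function name bras_cpd_sketch, the constant step size, and the batch size are illustrative assumptions, not the authors' reference implementation; the paper also considers adaptive (Adagrad-style) step schedules.

import numpy as np

def bras_cpd_sketch(X, R, step=0.05, batch=64, iters=2000, seed=0):
    # Illustrative block-randomized stochastic proximal gradient CPD for a
    # 3-way tensor, with nonnegativity constraints on all three factors.
    rng = np.random.default_rng(seed)
    dims = X.shape
    A = [rng.random((d, R)) for d in dims]  # random nonnegative initialization
    # Mode-n unfoldings are precomputed only for clarity; a low-memory
    # implementation would read the sampled fibers directly from X.
    unfold = [np.moveaxis(X, n, 0).reshape(dims[n], -1) for n in range(3)]
    for _ in range(iters):
        n = int(rng.integers(3))                 # randomized block: pick one mode
        m1, m2 = [m for m in range(3) if m != n]
        # Sample a minibatch of mode-n fibers via random index pairs.
        j = rng.integers(dims[m1], size=batch)
        k = rng.integers(dims[m2], size=batch)
        Hf = A[m1][j, :] * A[m2][k, :]           # sampled Khatri-Rao rows only
        Xf = unfold[n][:, j * dims[m2] + k]      # matching fibers (unfolding columns)
        G = (A[n] @ Hf.T - Xf) @ Hf / batch      # stochastic gradient of the LS fit
        A[n] = np.maximum(A[n] - step * G, 0.0)  # proximal step: project onto >= 0
    return A

The projection in the last update is the proximal operator of the nonnegativity constraint; swapping in other proximal operators (e.g., soft-thresholding for an l1 regularizer) is what gives the framework its flexibility.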
Pages: 2170 - 2185
Page count: 16