PLANC: Parallel Low-rank Approximation with Nonnegativity Constraints

Cited by: 7
Authors
Eswar, Srinivas [1 ]
Hayashi, Koby [1 ]
Ballard, Grey [2 ]
Kannan, Ramakrishnan [3 ]
Matheson, Michael A. [3 ]
Park, Haesun [1 ]
Affiliations
[1] Georgia Inst Technol, Dept CSE, Atlanta, GA 30308 USA
[2] Wake Forest Univ, Dept CS, Winston Salem, NC 27109 USA
[3] Oak Ridge Natl Lab, Oak Ridge, TN 37831 USA
Source
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE | 2021, Vol. 47, No. 3
Funding
U.S. National Science Foundation
Keywords
Tensor factorization; nonnegative least squares; communication-avoiding algorithms; TENSOR DECOMPOSITIONS; COLLECTIVE COMMUNICATION; MATRIX; ALGORITHMS; FACTORIZATION; OPTIMIZATION; FRAMEWORK; SPARSE;
DOI
10.1145/3432185
Chinese Library Classification
TP31 [Computer Software]
Discipline Classification Codes
081202; 0835
Abstract
We consider the problem of low-rank approximation of massive dense nonnegative tensor data, for example, to discover latent patterns in video and imaging applications. As the size of data sets grows, single workstations are hitting bottlenecks in both computation time and available memory. We propose a distributed-memory parallel computing solution to handle massive data sets, loading the input data across the memories of multiple nodes, and performing efficient and scalable parallel algorithms to compute the low-rank approximation. We present a software package called Parallel Low-rank Approximation with Nonnegativity Constraints, which implements our solution and allows for extension in terms of data (dense or sparse, matrices or tensors of any order), algorithm (e.g., from multiplicative updating techniques to alternating direction method of multipliers), and architecture (we exploit GPUs to accelerate the computation in this work). We describe our parallel distributions and algorithms, which are careful to avoid unnecessary communication and computation, show how to extend the software to include new algorithms and/or constraints, and report efficiency and scalability results for both synthetic and real-world data sets.
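The abstract names multiplicative updating as one family of algorithms the package supports for nonnegativity-constrained low-rank approximation. As a minimal single-node illustration of that idea (not PLANC's distributed implementation; the function name and parameters here are illustrative), the classic Lee-Seung multiplicative updates for nonnegative matrix factorization can be sketched as:

```python
import numpy as np

def nmf_multiplicative(A, k, iters=200, eps=1e-9, seed=0):
    """Approximate a nonnegative m x n matrix A as W @ H with W, H >= 0,
    using Lee-Seung multiplicative updates (a serial sketch only)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k)) + eps  # nonnegative random init
    H = rng.random((k, n)) + eps
    for _ in range(iters):
        # H <- H * (W^T A) / (W^T W H); elementwise, so H stays nonnegative
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        # W <- W * (A H^T) / (W H H^T)
        W *= (A @ H.T) / (W @ (H @ H.T) + eps)
    return W, H

# Usage: factor a small nonnegative matrix and measure reconstruction error.
rng = np.random.default_rng(1)
A = rng.random((20, 15))
W, H = nmf_multiplicative(A, k=5)
err = np.linalg.norm(A - W @ H) / np.linalg.norm(A)
```

Because the updates only multiply by nonnegative ratios, iterates remain nonnegative without any projection step; the distributed setting described in the paper additionally partitions A, W, and H across nodes and organizes the matrix products above to avoid unnecessary communication.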
Pages: 37