Distributed-Memory Parallel JointNMF

Cited by: 1

Authors:

Eswar, Srinivas [1]

Cobb, Benjamin [2]

Hayashi, Koby [2]

Kannan, Ramakrishnan [3]

Ballard, Grey [4]

Vuduc, Richard [2]

Park, Haesun [2]

Affiliations:

[1] Argonne Natl Lab, Lemont, IL 60439 USA

[2] Georgia Inst Technol, Sch Computat Sci & Engn, Atlanta, GA 30332 USA

[3] Oak Ridge Natl Lab, Oak Ridge, TN USA

[4] Wake Forest Univ, Dept Comp Sci, Winston Salem, NC 27101 USA

Source:

PROCEEDINGS OF THE 37TH INTERNATIONAL CONFERENCE ON SUPERCOMPUTING, ACM ICS 2023 | 2023

Funding:

U.S. Department of Energy; U.S. National Science Foundation

Keywords:

High Performance Computing; Multimodal Inputs; Nonnegative Matrix Factorization; NONNEGATIVE MATRIX; COMMUNICATION; MPI;

DOI:

10.1145/3577193.3593733

Chinese Library Classification (CLC) number:

TP301 [Theory and methods]

Discipline classification code:

081202

Abstract:

Joint Nonnegative Matrix Factorization (JointNMF) is a hybrid method for mining information from datasets that contain both feature and connection information. We propose distributed-memory parallelizations of three algorithms for solving the JointNMF problem based on Alternating Nonnegative Least Squares, Projected Gradient Descent, and Projected Gauss-Newton. We extend well-known communication-avoiding algorithms using a single processor grid case to our coupled case on two processor grids. We demonstrate the scalability of the algorithms on up to 960 cores (40 nodes) with 60% parallel efficiency. The more sophisticated Alternating Nonnegative Least Squares (ANLS) and Gauss-Newton variants outperform the first-order gradient descent method in reducing the objective on large-scale problems. We perform a topic modelling task on a large corpus of academic papers that consists of over 37 million paper abstracts and nearly a billion citation relationships, demonstrating the utility and scalability of the methods.

Pages: 301-312

Page count: 12