SCV-GNN: Sparse Compressed Vector-Based Graph Neural Network Aggregation

Cited by: 1
Authors
Unnikrishnan, Nanda K. [1 ]
Gould, Joe [1 ]
Parhi, Keshab K. [1 ]
Affiliations
[1] Univ Minnesota, Dept Elect & Comp Engn, Minneapolis, MN 55455 USA
Funding
U.S. National Science Foundation
Keywords
Sparse matrices; graph neural networks (GNNs); hardware; memory management; indexes; vector processors; accelerator architectures; neural networks; aggregation; neural network inference
DOI
10.1109/TCAD.2023.3291672
Chinese Library Classification
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
Graph neural networks (GNNs) have emerged as a powerful tool to process graph-based data in fields like communication networks, molecular interactions, chemistry, social networks, and neuroscience. GNNs are characterized by the ultrasparse nature of their adjacency matrices, which necessitates dedicated hardware beyond general-purpose sparse matrix multipliers. While there has been extensive research on designing dedicated hardware accelerators for GNNs, few works have explored the impact of the sparse storage format on the efficiency of GNN accelerators. This article proposes SCV-GNN with the novel sparse compressed vectors (SCVs) format optimized for the aggregation operation. We use Z-Morton ordering to derive a data-locality-based computation ordering and partitioning scheme. This article also shows how the proposed SCV-GNN scales on a vector processing system. Experimental results over various datasets show that the proposed method achieves a geometric mean speedup of 7.96× and 7.04× over compressed sparse column (CSC) and compressed sparse row (CSR) aggregation operations, respectively. The proposed method also reduces memory traffic by factors of 3.29× and 4.37× over CSC and CSR, respectively. Thus, the proposed aggregation format reduces both the latency and the memory accesses of GNN inference.
Pages: 4803 - 4816
Number of pages: 14
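
The SCV format itself is defined only in the full article and is not reproduced in this record. As a rough illustration of the operations the abstract refers to, below is a minimal Python sketch, with assumed toy sizes, of the CSR aggregation baseline (the sparse-dense product A·X that SCV-GNN is compared against) together with a hypothetical zmorton_key helper showing the bit interleaving behind Z-Morton ordering; none of this code comes from the paper.

    # Minimal sketch (not the paper's SCV implementation): GNN aggregation as the
    # sparse-dense product A @ X using SciPy's CSR format -- the baseline the
    # abstract compares against -- plus a Z-Morton (bit-interleaved) key that can
    # be used to order nonzeros for data locality.
    import numpy as np
    from scipy.sparse import random as sparse_random

    def zmorton_key(row: int, col: int, bits: int = 16) -> int:
        """Interleave the bits of (row, col) into a single Z-Morton code."""
        key = 0
        for b in range(bits):
            key |= ((row >> b) & 1) << (2 * b + 1)
            key |= ((col >> b) & 1) << (2 * b)
        return key

    # Toy ultrasparse adjacency matrix A and dense node-feature matrix X
    # (sizes are illustrative, not taken from the paper).
    n, f = 1024, 64
    A = sparse_random(n, n, density=0.001, format="csr", random_state=0)
    X = np.random.rand(n, f).astype(np.float32)

    # CSR aggregation baseline: row i of H accumulates the features of i's neighbors.
    H = A @ X  # H[i] = sum_j A[i, j] * X[j]

    # Reordering the nonzeros along the Z-order curve keeps nearby rows and columns
    # (and hence nearby rows of X and H) close together in the processing order,
    # which is the data-locality idea the abstract attributes to Z-Morton ordering.
    coo = A.tocoo()
    order = np.argsort([zmorton_key(int(r), int(c)) for r, c in zip(coo.row, coo.col)])
    for r, c, v in zip(coo.row[order], coo.col[order], coo.data[order]):
        pass  # an accelerator would stream (r, c, v) and accumulate v * X[c] into H[r]

In the paper, the same Z-order traversal additionally drives the partitioning of work across vector lanes; those details, and the SCV storage layout itself, are described only in the article.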