GAS: General-Purpose In-Memory-Computing Accelerator for Sparse Matrix Multiplication

Cited by: 1

Authors:

Zhang, Xiaoyu [1,2]

Li, Zerun [1,2]

Liu, Rui [1,3]

Chen, Xiaoming [1,2]

Han, Yinhe [1,2]

Affiliations:

[1] Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China

[2] Univ Chinese Acad Sci, Beijing 100190, Peoples R China

[3] Xiangtan Univ, Sch Mat Sci & Engn, Xiangtan 411105, Hunan, Peoples R China

Source:

IEEE TRANSACTIONS ON COMPUTERS | 2024 / Vol. 73 / Issue 06

Keywords:

Sparse matrices; FeFETs; Computer architecture; Arrays; Nonvolatile memory; Microprocessors; Vectors; Sparse matrix multiplication; in-memory computing; SpMV; SpMSpV; SpMM; SpMSpM;

DOI:

10.1109/TC.2024.3371790

Chinese Library Classification number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Sparse matrix multiplication is widely used in various practical applications. Different accelerators have been proposed to speed up sparse matrix-dense vector multiplication (SpMV), sparse matrix-sparse vector multiplication (SpMSpV), sparse matrix-dense matrix multiplication (SpMM), and sparse matrix-sparse matrix multiplication (SpMSpM). The performance of traditional sparse matrix multiplication accelerators is typically bounded by memory access due to the poor data locality and irregular memory access. In-memory computing (IMC) is a promising technique to alleviate the memory bottleneck. Previous IMC studies are mostly focused on accelerating a single sparse matrix multiplication function. In this paper, we propose GAS, a general-purpose IMC accelerator for sparse matrix multiplication. GAS integrates non-volatile memory based content-addressable memory (CAM) arrays and multiply-add computation (MAC) arrays to support sparse matrices represented in the double-precision floating-point format. Using a unified outer product based multiplication methodology, GAS supports the acceleration of SpMV, SpMSpV, SpMM, and SpMSpM. We further propose four optimization techniques to speed up the computation of GAS. GAS achieves significant speedups and energy savings over central processing unit (CPU) and graphics processing unit (GPU) implementations. Compared with state-of-the-art traditional and IMC-based accelerators, GAS not only supports more functions, but also achieves higher performance and energy efficiency.
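
The outer-product formulation mentioned in the abstract can be illustrated in software. The following is a minimal sketch of generic outer-product sparse multiplication and is not the GAS design itself: the function name `outer_product_spmspm` and the dictionary-based storage (`a_cols`, `b_rows`) are assumptions made for illustration, and the abstract does not describe how GAS maps these steps onto its CAM and MAC arrays. The sketch only shows why one routine can cover all four cases: a vector is a matrix with a single column, and a dense operand is a sparse one with every entry stored.

```python
from collections import defaultdict


def outer_product_spmspm(a_cols, b_rows):
    """Compute C = A @ B as a sum of outer products (illustrative only).

    a_cols[k] maps row index i -> A[i, k]   (nonzeros of column k of A)
    b_rows[k] maps col index j -> B[k, j]   (nonzeros of row k of B)
    Returns a dict mapping (i, j) -> C[i, j] for every accumulated entry.
    """
    c = defaultdict(float)
    for k, col in a_cols.items():
        row = b_rows.get(k)  # index match between column k of A and row k of B
        if not row:
            continue  # nothing to multiply for this k
        # Outer product of column k of A with row k of B, accumulated into C.
        for i, a in col.items():
            for j, b in row.items():
                c[(i, j)] += a * b
    return dict(c)


# A = [[1, 0, 2],
#      [0, 3, 0]]   stored by column
a_cols = {0: {0: 1.0}, 1: {1: 3.0}, 2: {0: 2.0}}

# SpMSpM: B = [[4, 0], [0, 5], [6, 0]] stored by row
b_rows = {0: {0: 4.0}, 1: {1: 5.0}, 2: {0: 6.0}}
c_matrix = outer_product_spmspm(a_cols, b_rows)

# SpMSpV: x = [0, 7, 0]^T treated as a 3x1 sparse matrix
x_rows = {1: {0: 7.0}}
y_vector = outer_product_spmspm(a_cols, x_rows)
```

Here `c_matrix` holds the two nonzeros of the 2x2 product and `y_vector` holds the single nonzero of the 2x1 product; the same call handles both because the vector case simply has one output column.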


Pages: 1427 - 1441

Number of pages: 15
