MOSCON: Modified Outer Product based Sparse Matrix-Matrix Multiplication Accelerator with Configurable Tiles

Cited by: 3
Authors
Noble, G. [1]
Nalesh, S. [2]
Kala, S. [1]
Affiliations
[1] Indian Inst Informat Technol Kottayam, Dept Elect & Commun Engn, Kottayam, Kerala, India
[2] Cochin Univ Sci & Technol, Dept Elect, Cochin, Kerala, India
Source
2023 36TH INTERNATIONAL CONFERENCE ON VLSI DESIGN AND 2023 22ND INTERNATIONAL CONFERENCE ON EMBEDDED SYSTEMS, VLSID | 2023
Keywords
Deep learning; Sparse matrix multiplication; Execution time; FPGA accelerator
DOI
10.1109/VLSID57277.2023.00061
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
General Sparse Matrix-Matrix Multiplication (SpGEMM), the product of two sparse matrices, is a key operation in many deep learning algorithms. Sparse matrices contain only a few non-zero elements, which makes conventional matrix multiplication algorithms inefficient. Hence, specialized architectures for sparse matrix multiplication have been proposed. Prior works in this field use outer-product-based implementations and suffer from poor load balance across the processing elements. We propose a modified outer-product-based sparse matrix-matrix multiplication architecture with configurable tiles, referred to as MOSCON, which can be accelerated on Field Programmable Gate Arrays (FPGAs). MOSCON can multiply sparse matrices of any dimensions and combines the advantages of the outer product implementation with the features of a load-balanced architecture. The proposed architecture has been implemented on a Xilinx Kintex-7 FPGA device and gives an average performance gain of 9.21% compared with state-of-the-art implementations.
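
For readers unfamiliar with the outer product formulation named in the abstract, the following is a minimal software sketch of the generic technique, not of the MOSCON architecture itself. It computes C = A × B as the sum over k of (column k of A) ⊗ (row k of B). The function name `outer_product_spgemm`, the dictionary-of-lists storage, and the per-index work counter are assumptions made for illustration; the counter only hints at one plausible way uneven work can arise when each index k is handed to a different processing element.

```python
from collections import defaultdict


def outer_product_spgemm(a_cols, b_rows):
    """Multiply two sparse matrices with the outer product formulation.

    a_cols[k] lists the (row_index, value) non-zeros of column k of A.
    b_rows[k] lists the (col_index, value) non-zeros of row k of B.
    Returns the non-zeros of C keyed by (i, j), plus the number of
    multiplications performed for each k (hypothetical imbalance metric).
    """
    c = defaultdict(float)
    work_per_k = {}
    for k, col in a_cols.items():
        row = b_rows.get(k, [])
        # Each (column k of A, row k of B) pair yields a rank-1 partial product.
        work_per_k[k] = len(col) * len(row)
        for i, a_val in col:
            for j, b_val in row:
                c[(i, j)] += a_val * b_val  # merge partial products
    return dict(c), work_per_k


# A = [[1, 0], [2, 3]] stored by column; B = [[0, 4], [5, 0]] stored by row.
a_cols = {0: [(0, 1), (1, 2)], 1: [(1, 3)]}
b_rows = {0: [(1, 4)], 1: [(0, 5)]}
c, work = outer_product_spgemm(a_cols, b_rows)
```

In this toy case index 0 performs two multiplications and index 1 performs one, so a one-index-per-element assignment would already be uneven.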
Pages: 264-269
Number of pages: 6
Related papers
16 in total
  • [1] [Anonymous], 2021, SPAGHETTI STREAMING
  • [2] [Anonymous], 2021, GAMMA LEVERAGING GUS
  • [3] ALRESCHA: A Lightweight Reconfigurable Sparse-Computation Accelerator
    Asgari, Bahar
    Hadidi, Ramyad
    Krishna, Tushar
    Kim, Hyesoon
    Yalamanchili, Sudhakar
    [J]. 2020 IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA 2020), 2020, : 249 - 260
  • [4] Choi U., 2022, IEEE T BIG DATA
  • [5] Gustavson F. G., 1978, ACM Transactions on Mathematical Software, V4, P250, DOI 10.1145/355791.355796
  • [6] Efficient CNN Accelerator on FPGA
    Kala, S.
    Nalesh, S.
    [J]. IETE JOURNAL OF RESEARCH, 2020, 66 (06) : 733 - 740
  • [7] Kala S., 2019, IEEE TVLSI
  • [8] Linghao S., 2022, ACMSIGDA FPGA
  • [9] CSR5: An Efficient Storage Format for Cross-Platform Sparse Matrix-Vector Multiplication
    Liu, Weifeng
    Vinter, Brian
    [J]. PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING (ICS'15), 2015, : 339 - 350
  • [10] Mahesh M., 2021, 34 SOCC