HeNCoG: A Heterogeneous Near-memory Computing Architecture for Energy Efficient GCN Acceleration

Cited by: 0
Authors
Hwang, Seung-Eon [1]
Song, Duyeong [1]
Park, Jongsun [1]
Affiliations
[1] Korea Univ, Sch Elect Engn, Seoul, South Korea
Source
2024 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS 2024 | 2024
Funding
National Research Foundation, Singapore;
Keywords
Graph Convolutional Network; Sparse Matrix Multiplication; Near-memory Computing; Domain Specific Accelerator; PERFORMANCE;
DOI
10.1109/ISCAS58744.2024.10558133
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
The graph convolutional network (GCN), which first applied convolutional operations to graph data, has gained attention in various tasks involving relational data. Previous GCN accelerators have been designed either with heterogeneous cores, reflecting the two stages of inference (aggregation and combination), or with a unified core that treats multi-layer inference as an iterative sparse-dense matrix multiplication. However, these prior works suffer from an unnecessarily large number of multiply-accumulate (MAC) operations and/or main memory accesses. In this paper, we propose HeNCoG, a GCN accelerator that uses a heterogeneous MAC array core for the combination stage and a near-memory computing core for the aggregation stage. In HeNCoG, because changing the stage execution order significantly reduces the number of MAC operations, the combination stage is executed first, with a row-stationary dataflow. In the aggregation stage, magneto-resistive random-access memory (MRAM)-based near-memory computing is employed to reduce the number of main memory accesses needed to read the adjacency matrix of the graph dataset. Graph partitioning and double buffering are also applied to further improve hardware efficiency. Simulation results show that the HeNCoG architecture reduces execution cycles by 97% and memory accesses by 42% compared to previous works.
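The effect of the stage execution order on the MAC count can be illustrated with a rough back-of-the-envelope sketch. It is not taken from the paper: the function name `mac_counts`, the cost model (a sparse-dense product costs nnz(sparse) x columns(dense); a dense-dense product costs rows x inner x columns), and all sizes are assumptions chosen for illustration. One GCN layer computes A·X·W, and matrix multiplication is associative, so both orders give the same result at different cost.

```python
def mac_counts(n, f_in, f_out, nnz_a, nnz_x):
    """Estimate MACs for one GCN layer A @ X @ W under both stage orders.

    Assumed cost model (illustrative, not from the paper):
      sparse x dense -> nnz(sparse) * cols(dense)
      dense  x dense -> rows * inner * cols
    A is the n x n adjacency matrix, X the n x f_in feature matrix,
    W the f_in x f_out weight matrix.
    """
    # Aggregation first: (A @ X) @ W. A @ X is a dense n x f_in intermediate.
    aggregation_first = nnz_a * f_in + n * f_in * f_out
    # Combination first: A @ (X @ W). X @ W shrinks features to f_out columns.
    combination_first = nnz_x * f_out + nnz_a * f_out
    return aggregation_first, combination_first


def matmul(a, b):
    """Plain list-of-lists matrix product, used only for the sanity check."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


# Sanity check on a tiny hypothetical graph: both orders give the same output.
A = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]            # adjacency with self-loops
X = [[1, 0, 0, 2], [0, 3, 0, 0], [0, 0, 4, 0]]   # sparse input features
W = [[1], [2], [3], [4]]                         # 4 -> 1 feature projection
assert matmul(matmul(A, X), W) == matmul(A, matmul(X, W))

# Hypothetical layer sizes: 1000 nodes, 512 -> 16 features.
agg_first, comb_first = mac_counts(n=1000, f_in=512, f_out=16,
                                   nnz_a=5000, nnz_x=20000)
print(agg_first, comb_first)
```

Under these assumed sizes, the combination-first order needs far fewer MACs because the aggregation then operates on f_out columns rather than f_in. Actual savings depend on the dataset and layer dimensions.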
Pages: 5