OPT-GCN: A Unified and Scalable Chiplet-Based Accelerator for High-Performance and Energy-Efficient GCN Computation

Cited by: 1
Authors
Zhao, Yingnan [1 ]
Wang, Ke [2 ]
Louri, Ahmed [1 ]
Affiliations
[1] George Washington Univ, Dept Elect & Comp Engn, Washington, DC 20052 USA
[2] Univ North Carolina Charlotte, Dept Elect & Comp Engn, Charlotte, NC 28223 USA
Keywords
Engines; System-on-chip; Vectors; Inference algorithms; Computer architecture; Energy efficiency; Design automation; Chiplet-based design; graph convolutional network (GCN); hardware accelerator; hardware-algorithm co-design; GRAPH NEURAL-NETWORK; CLASSIFICATION
DOI
10.1109/TCAD.2024.3401543
Chinese Library Classification (CLC)
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
As the size of real-world graphs continues to grow at an exponential rate, performing the graph convolutional network (GCN) inference efficiently is becoming increasingly challenging. Prior works that employ a unified computing engine with a predefined computation order lack the necessary flexibility and scalability to handle diverse input graph datasets. In this article, we introduce OPT-GCN, a chiplet-based accelerator design that performs GCN inference efficiently while providing flexibility and scalability through an architecture-algorithm co-design. On the architecture side, the proposed design integrates a unified computing engine in each chiplet and an active interposer, both of which are adaptable to efficiently perform the GCN inference and facilitate data communication. On the algorithm side, we propose dynamic scheduling and mapping algorithms to optimize memory access and on-chip computations for diverse GCN applications. Experimental results show that the proposed design provides a memory access reduction by a factor of 11.3x, 3.4x, and 1.4x, and energy savings of 15.2x, 3.7x, and 1.6x on average compared to HyGCN, AWB-GCN, and GCNAX, respectively.
Pages: 4827-4840
Page count: 14