OPT-GCN: A Unified and Scalable Chiplet-Based Accelerator for High-Performance and Energy-Efficient GCN Computation

Cited by: 1

Authors:

Zhao, Yingnan [1]

Wang, Ke [2]

Louri, Ahmed [1]

Affiliations:

[1] George Washington Univ, Dept Elect & Comp Engn, Washington, DC 20052 USA

[2] Univ North Carolina Charlotte, Dept Elect & Comp Engn, Charlotte, NC 28223 USA

Source:

IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS | 2024, Vol. 43, No. 12

Keywords:

Engines; System-on-chip; Vectors; Inference algorithms; Computer architecture; Energy efficiency; Design automation; Chiplet-based design; graph convolutional network (GCN); hardware accelerator; hardware-algorithm co-design; GRAPH NEURAL-NETWORK; CLASSIFICATION

DOI:

10.1109/TCAD.2024.3401543

Chinese Library Classification:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

As the size of real-world graphs continues to grow at an exponential rate, performing the graph convolutional network (GCN) inference efficiently is becoming increasingly challenging. Prior works that employ a unified computing engine with a predefined computation order lack the necessary flexibility and scalability to handle diverse input graph datasets. In this article, we introduce OPT-GCN, a chiplet-based accelerator design that performs GCN inference efficiently while providing flexibility and scalability through an architecture-algorithm co-design. On the architecture side, the proposed design integrates a unified computing engine in each chiplet and an active interposer, both of which are adaptable to efficiently perform the GCN inference and facilitate data communication. On the algorithm side, we propose dynamic scheduling and mapping algorithms to optimize memory access and on-chip computations for diverse GCN applications. Experimental results show that the proposed design provides a memory access reduction by a factor of 11.3x, 3.4x, and 1.4x, and energy savings of 15.2x, 3.7x, and 1.6x on average compared to HyGCN, AWB-GCN, and GCNAX, respectively.


Pages: 4827-4840

Page count: 14

