As real-world graphs continue to grow at an exponential rate, performing graph convolutional network (GCN) inference efficiently is becoming increasingly challenging. Prior works that employ a unified computing engine with a predefined computation order lack the flexibility and scalability needed to handle diverse input graph datasets. In this article, we introduce OPT-GCN, a chiplet-based accelerator that performs GCN inference efficiently while providing flexibility and scalability through architecture-algorithm co-design. On the architecture side, the proposed design integrates a unified computing engine in each chiplet and an active interposer, both of which adapt to perform GCN inference efficiently and to facilitate data communication. On the algorithm side, we propose dynamic scheduling and mapping algorithms that optimize memory access and on-chip computation for diverse GCN applications. Experimental results show that the proposed design reduces memory accesses by 11.3x, 3.4x, and 1.4x and saves energy by 15.2x, 3.7x, and 1.6x on average compared with HyGCN, AWB-GCN, and GCNAX, respectively.
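For context, a minimal sketch of the workload follows. It is not the OPT-GCN design itself; it assumes only the standard GCN propagation rule H' = sigma(A_hat H W), where A_hat is the normalized sparse adjacency matrix, and illustrates why a predefined computation order can be suboptimal: executing the combination (H W) before the sparse aggregation (A_hat ·) changes the operation count and memory traffic even though the result is identical. The function name `gcn_layer` and the toy dimensions are illustrative choices, not part of the paper.

```python
# Minimal sketch (not the OPT-GCN implementation) of one GCN layer,
# showing how the execution order affects work and memory traffic.
import numpy as np
import scipy.sparse as sp

def gcn_layer(a_hat: sp.csr_matrix, h: np.ndarray, w: np.ndarray,
              combine_first: bool = True) -> np.ndarray:
    """One GCN layer; `combine_first` selects the computation order."""
    if combine_first:
        # A_hat @ (H @ W): shrink the feature dimension first, so the
        # sparse aggregation touches fewer columns -- cheaper when W
        # reduces dimensionality (the common case in GCN layers).
        out = a_hat @ (h @ w)
    else:
        # (A_hat @ H) @ W: aggregate full-width features first -- more
        # SpMM work and DRAM traffic for the same mathematical result.
        out = (a_hat @ h) @ w
    return np.maximum(out, 0.0)  # ReLU activation

# Tiny example: 4 nodes, 8 input features, 2 output features.
rng = np.random.default_rng(0)
a_hat = sp.random(4, 4, density=0.5, format="csr", random_state=0)
h = rng.standard_normal((4, 8))
w = rng.standard_normal((8, 2))

# Both orders compute the same layer output (matrix product associativity).
assert np.allclose(gcn_layer(a_hat, h, w, combine_first=True),
                   gcn_layer(a_hat, h, w, combine_first=False))
```

Accelerators that hard-wire one of these orders must pay its cost on every input graph; the scheduling flexibility the abstract claims amounts to choosing the cheaper order (and the corresponding tiling) per dataset.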