DyGA: A Hardware-Efficient Accelerator With Traffic-Aware Dynamic Scheduling for Graph Convolutional Networks

Cited by: 0
Authors
Xie, Ruiqi [1 ]
Yin, Jun [1 ]
Han, Jun [1 ]
Affiliations
[1] Fudan Univ, State Key Lab ASIC & Syst, Shanghai 201203, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Task analysis; Sparse matrices; Convolutional neural networks; Dynamic scheduling; Deep learning; Speech recognition; Hardware acceleration; Graph convolutional networks; Domain-specific architecture; Neural networks
DOI
10.1109/TCSI.2021.3112826
CLC Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Codes
0808; 0809;
Abstract
With the growing range of applications for Graph Convolutional Networks (GCNs), there is an increasing demand for their efficient hardware acceleration. Compared with CNN workloads, GCN workloads pose new challenges such as randomness, sparsity, and nonuniformity, which degrade the performance of previous AI accelerators. In this paper, we propose DyGA, a hardware-efficient GCN accelerator featuring graph partitioning, a customized storage policy, traffic-aware dynamic scheduling, and out-of-order execution. Synthesized and evaluated in a TSMC 28-nm process, the accelerator sustains an average throughput above 95% of its peak performance, with full hardware utilization, on representative graph datasets. With a high area efficiency of 0.217 GOPS/K-logic-gates and 8.06 GOPS/KB-PE-buffer, and an energy efficiency of 384 GOPS/W, the proposed accelerator outperforms previous state-of-the-art works in sparse data processing.
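The randomness and sparsity the abstract refers to come from the GCN layer computation itself, H' = Â·H·W: the aggregation step gathers feature rows of arbitrary neighbors (irregular, sparse memory access), while the combination step is a regular dense matrix product. The toy sketch below (not the paper's implementation; graph, features, and mean normalization are made-up values) illustrates that contrast:

```python
# Illustrative sketch, NOT DyGA's design: one GCN layer over an
# adjacency list, separating the irregular aggregation phase from
# the regular, CNN-like combination phase.

def gcn_layer(adj, H, W):
    """adj: dict node -> neighbor list (self-loop included);
    H: list of input feature vectors; W: dense weight matrix (rows)."""
    n, f_in, f_out = len(H), len(W), len(W[0])
    out = [[0.0] * f_out for _ in range(n)]
    for v in range(n):
        # Aggregation: gather scattered neighbor rows -- random,
        # data-dependent accesses that defeat dense CNN dataflows.
        agg = [0.0] * f_in
        norm = 1.0 / len(adj[v])  # simple mean normalization (toy choice)
        for u in adj[v]:
            for k in range(f_in):
                agg[k] += norm * H[u][k]
        # Combination: dense matrix-vector product -- regular access pattern.
        for j in range(f_out):
            out[v][j] = sum(agg[k] * W[k][j] for k in range(f_in))
    return out

# Tiny example: 3-node path graph with self-loops.
adj = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0], [2.0]]
print(gcn_layer(adj, H, W))
```

Because `adj[v]` differs per node, the per-node work is nonuniform, which is the load-imbalance problem that traffic-aware dynamic scheduling targets.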
Pages: 5095-5107 (13 pages)