Stream: A Modeling Framework for Fine-grained Layer Fusion on Multi-core DNN Accelerators

Cited by: 2
Authors
Symons, Arne [1 ]
Mei, Linyan [1 ]
Colleman, Steven [1 ]
Houshmand, Pouya [1 ]
Karl, Sebastian [1 ,2 ]
Verhelst, Marian [1 ]
Affiliations
[1] Katholieke Univ Leuven, Leuven, Belgium
[2] Tech Univ Munich, Munich, Germany
Source
2023 IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE (ISPASS), 2023
Keywords
DNN; multi-core; accelerator; layer fusion; design space exploration
DOI
10.1109/ISPASS57527.2023.00051
Chinese Library Classification
TP3 [Computing technology, computer technology]
Subject Classification Code
0812
Abstract
To keep up with the ever-growing performance demands of DNN processing, specialized hardware (HW) accelerators are shifting towards multi-core architectures. Stream is the first open-source design space exploration (DSE) framework for the co-optimization of HW architecture and fine-grained scheduling of such multi-core DNN accelerators. Stream supports fine-grained layer fusion to optimally trade off energy, latency, and/or on-chip memory footprint for constrained edge devices. Validation against three state-of-the-art (SotA) chips, together with a case study on seven HW architectures with different scheduling granularities, demonstrates the reliability and capabilities of Stream. The results show that high-level architectural decisions greatly impact HW efficiency under the fine-grained scheduling paradigm: compared to traditional scheduling at layer granularity, the energy-delay product is reduced by 2.4x for single-core architectures and by up to 30x for heterogeneous multi-core architectures. Stream is open-source at github.com/ZigZag-Project/stream.
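Illustrative example
As a minimal sketch of the scheduling trade-off the abstract describes (this is not Stream's actual API; all function names, tile counts, and per-tile latencies below are illustrative assumptions), the following self-contained Python toy contrasts classic layer-by-layer scheduling with fine-grained, tile-level layer fusion for a two-layer pipeline mapped onto two cores:

# Toy model (NOT Stream's API): two layers on two cores, each layer split
# into tiles. Compares when layer 2 may start and how many intermediate
# tiles must stay on chip under the two scheduling paradigms.

def layer_by_layer(tiles, t1, t2):
    """Layer 2 only starts after every tile of layer 1 has finished."""
    latency = tiles * t1 + tiles * t2
    peak_tiles_buffered = tiles  # the full intermediate feature map is kept on chip
    return latency, peak_tiles_buffered

def fine_grained_fusion(tiles, t1, t2):
    """Each layer-2 tile starts as soon as its producing layer-1 tile is done."""
    latency = t1 + (tiles - 1) * max(t1, t2) + t2  # two-stage software pipeline
    peak_tiles_buffered = 2  # one tile being produced, one being consumed
    return latency, peak_tiles_buffered

if __name__ == "__main__":
    TILES, T1, T2 = 8, 1.0, 1.2  # hypothetical workload parameters
    for name, schedule in (("layer-by-layer     ", layer_by_layer),
                           ("fine-grained fusion", fine_grained_fusion)):
        lat, buf = schedule(TILES, T1, T2)
        print(f"{name}: latency = {lat:5.1f}, peak intermediate tiles = {buf}")

With these illustrative numbers, the fused schedule cuts latency from 17.6 to 10.6 time units and shrinks the peak intermediate footprint from 8 tiles to 2, mirroring the latency/on-chip-memory trade-off that Stream's DSE explores at much finer granularity and across heterogeneous cores.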
Pages
355-357 (3 pages)