Hardware-Software Codesign of a CNN Accelerator

Cited by: 3
Authors
Yi, Changjae [1]
Kang, Donghyun [2]
Ha, Soonhoi [1]
Affiliations
[1] Seoul Natl Univ, Seoul, South Korea
[2] Samsung Elect, Suwon, South Korea
Source
2022 25th Euromicro Conference on Digital System Design (DSD) | 2022
Keywords
Neural processing unit; HW/SW codesign; CNN accelerator
DOI
10.1109/DSD57027.2022.00054
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The explosive growth of deep learning applications based on convolutional neural networks (CNNs) in embedded systems is spurring the development of hardware CNN accelerators, called neural processing units (NPUs). In this work, we present how the hardware-software codesign methodology can be applied to the design of a novel adder-type NPU. After devising a baseline datapath that enables fully pipelined execution of layers, we define a high-level behavior model, based on which a high-level compiler and a virtual prototyping system are built concurrently. Because the microarchitecture of an NPU can be changed simply by modifying the simulation models of the hardware modules, we could readily explore the design space of the NPU microarchitecture. In addition, we could evaluate the effect of hardware extensions that support the various types of non-convolutional operations widely used in recent CNN models. After the final datapath is determined, we design the control structure and the low-level compiler and implement the NPU prototype. Implementation results on an FPGA prototype show the viability of the proposed methodology and its outcome.
Pages: 348-356
Page count: 9