HybridDNN: A Framework for High-Performance Hybrid DNN Accelerator Design and Implementation

Cited by: 44
Authors
Ye, Hanchen [1]
Zhang, Xiaofan [1]
Huang, Zhize [2]
Chen, Gengsheng [2]
Chen, Deming [1]
Affiliations
[1] Univ Illinois, Champaign, IL 61820 USA
[2] Fudan Univ, Shanghai, Peoples R China
Source
PROCEEDINGS OF THE 2020 57TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC) | 2020
DOI
10.1109/dac18072.2020.9218684
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202; 0835
Abstract
To speed up Deep Neural Network (DNN) accelerator design and enable effective implementation, we propose HybridDNN, a framework for building high-performance hybrid DNN accelerators and delivering FPGA-based hardware implementations. Novel techniques include a highly flexible and scalable architecture with a hybrid Spatial/Winograd convolution (CONV) Processing Engine (PE), a comprehensive design space exploration tool, and a complete design flow to fully support accelerator design and implementation. Experimental results show that the accelerators generated by HybridDNN can deliver 3375.7 and 83.3 GOPS on a high-end FPGA (VU9P) and an embedded FPGA (PYNQ-Z1), respectively, a 1.8x performance improvement over state-of-the-art accelerator designs. This demonstrates that HybridDNN is flexible and scalable and can target both cloud and embedded hardware platforms with vastly different resource constraints.
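As background for the "hybrid Spatial/Winograd" terminology in the abstract, the following is a minimal Python sketch of the arithmetic trade-off between a direct (spatial) convolution and the standard 1-D Winograd F(2,3) algorithm. It is not the HybridDNN Processing Engine and is not taken from the paper; the function names `spatial_conv_1d` and `winograd_f23` are hypothetical and chosen only for illustration.

```python
# Illustrative only: 1-D Winograd F(2,3) versus direct (spatial) convolution.
# This is NOT the HybridDNN PE; it only shows why a Winograd mode can save multiplies.

def spatial_conv_1d(d, g):
    """Direct method: 2 outputs from 4 inputs and a 3-tap filter (6 multiplies)."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]

def winograd_f23(d, g):
    """Winograd F(2,3): the same 2 outputs using 4 element-wise multiplies."""
    # Filter transform (can be precomputed once per filter).
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]
    # Input transform followed by the 4 multiplies.
    m0 = (d[0] - d[2]) * u0
    m1 = (d[1] + d[2]) * u1
    m2 = (d[2] - d[1]) * u2
    m3 = (d[1] - d[3]) * u3
    # Output transform.
    return [m0 + m1 + m2, m1 - m2 - m3]

if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]
    taps = [0.5, 1.0, -0.25]
    print(spatial_conv_1d(data, taps))
    print(winograd_f23(data, taps))
```

Both functions compute the same two outputs; the Winograd form trades multiplications for extra additions and transform steps, which is the general motivation for offering both modes in one engine.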
Pages: 6
Related papers
26 in total
[1]  
[Anonymous], 2016, P FPT
[2]   LOPASS: A Low-Power Architectural Synthesis System for FPGAs With Interconnect Estimation and Optimization [J].
Chen, Deming ;
Cong, Jason ;
Fan, Yiping ;
Wan, Lu .
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2010, 18 (04) :564-577
[3]  
Chen Deming, 2005, SRC Techcon
[4]   DianNao: A Small-Footprint High-Throughput Accelerator for Ubiquitous Machine-Learning [J].
Chen, Tianshi ;
Du, Zidong ;
Sun, Ninghui ;
Wang, Jia ;
Wu, Chengyong ;
Chen, Yunji ;
Temam, Olivier .
ACM SIGPLAN NOTICES, 2014, 49 (04) :269-283
[5]   Cloud-DNN: An Open Framework for Mapping DNN Models to Cloud FPGAs [J].
Chen, Yao ;
He, Jiong ;
Zhang, Xiaofan ;
Hao, Cong ;
Chen, Deming .
PROCEEDINGS OF THE 2019 ACM/SIGDA INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE GATE ARRAYS (FPGA'19), 2019, :73-82
[6]  
Deo Manish, 2017, ENABLING NEXT GENERA
[7]   FP-DNN: An Automated Framework for Mapping Deep Neural Networks onto FPGAs with RTL-HLS Hybrid Templates [J].
Guan, Yijin ;
Liang, Hao ;
Xu, Ningyi ;
Wang, Wenqiang ;
Shi, Shaoshuai ;
Chen, Xi ;
Sun, Guangyu ;
Zhang, Wei ;
Cong, Jason .
2017 IEEE 25TH ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM 2017), 2017, :152-159
[8]   FPGA/DNN Co-Design: An Efficient Design Methodology for IoT Intelligence on the Edge [J].
Hao, Cong ;
Zhang, Xiaofan ;
Li, Yuhong ;
Huang, Sitao ;
Xiong, Jinjun ;
Rupnow, Kyle ;
Hwu, Wen-mei ;
Chen, Deming .
PROCEEDINGS OF THE 2019 56TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2019,
[9]   Accuracy vs. Efficiency: Achieving Both through FPGA-Implementation Aware Neural Architecture Search [J].
Jiang, Weiwen ;
Zhang, Xinyi ;
Sha, Edwin H-M ;
Yang, Lei ;
Zhuge, Qingfeng ;
Shi, Yiyu ;
Hu, Jingtong .
PROCEEDINGS OF THE 2019 56TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2019,
[10]   Fast Algorithms for Convolutional Neural Networks [J].
Lavin, Andrew ;
Gray, Scott .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :4013-4021