Analysis & Design of Convolution Operator for High Speed and High Accuracy Convolutional Neural Network-Based Inference Engines

Cited by: 5
Authors
Deepika, S. [1]
Arunachalam, V. [1]
Affiliations
[1] Vellore Inst Technol, Dept Micro & Nano Elect, Vellore 632014, Tamil Nadu, India
Keywords
Hardware; Convolution; Computational modeling; Analytical models; MATLAB; Error analysis; Data models; Convolutional neural network; convolution operator; data representation; inference engine; range and error analysis
DOI
10.1109/TC.2021.3051627
CLC Number
TP3 [Computing technology, computer technology]
Discipline Code
0812
Abstract
Inference engines (IEs) based on convolutional neural networks (CNNs) are memory intensive and computationally complex. IEs require an optimum data format for representing kernel weights and feature maps (FMs) to reduce the computational complexity of the convolution operator (CO). Considering implementation aspects and loss of precision, the proposed CO implements multiplications in floating point and additions in fixed point. The optimal data format is decided through a MATLAB-based range and precision analysis: image-based models such as AlexNet, VGG-16, and VGG-19 are evaluated with single-precision floating point (SPFP) as the reference representation. The analysis reveals that half-precision floating point (HPFP) is required for the kernel weights and a 16-bit fixed-point format (10-bit integer, 6-bit fraction) for the feature maps. A 16-bit Fix/Float 2x1 CO is designed, and a trade-off analysis against 16-bit fixed point, SPFP, and HPFP is performed; the proposed CO achieves a worst-case accuracy of 97 percent relative to SPFP. The 2x1 CO is implemented with a multiplication operation processing unit (MOPU) in place of a shifter/barrel-shifter unit. The ASIC implementation of the CO requires 22 percent less area and consumes 17.98 percent less power than HPFP at a 250 MHz clock, while achieving a throughput of 750 MOPS and a hardware efficiency of 24.22 TOPS/W.
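As a rough numerical sketch (not the paper's hardware design), the following Python snippet models the mixed-format arithmetic the abstract describes: half-precision (HPFP) kernel weights, feature maps quantized to 16-bit Q10.6 fixed point, a floating-point multiply, and fixed-point accumulation. The helper names to_q10_6 and mixed_mac are hypothetical, introduced here only for illustration.

import numpy as np

FRAC_BITS = 6                 # Q10.6: 10 integer bits, 6 fraction bits
SCALE = 1 << FRAC_BITS        # 64 fractional steps per unit

def to_q10_6(x):
    # Quantize a real value to 16-bit Q10.6 with saturation.
    q = int(round(x * SCALE))
    return max(-(1 << 15), min((1 << 15) - 1, q))

def mixed_mac(weights, fmap_q):
    # weights: kernel weights, cast to half precision (HPFP)
    # fmap_q:  feature-map samples already quantized to Q10.6 integers
    acc = 0                                          # fixed-point accumulator
    for w, x_q in zip(weights, fmap_q):
        prod = float(np.float16(w)) * (x_q / SCALE)  # floating-point multiply
        acc += to_q10_6(prod)                        # fixed-point addition
    return acc / SCALE                               # accumulated real value

# Example: mixed_mac([0.5, -1.25], [to_q10_6(2.0), to_q10_6(3.5)]) -> -3.375

The sketch mirrors the abstract's design choice: the expensive alignment logic of floating-point addition is avoided by accumulating in fixed point, while the multiply keeps the dynamic range of HPFP weights.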
Pages: 390-396
Page count: 7