AxP: A HW-SW Co-Design Pipeline for Energy-Efficient Approximated ConvNets via Associative Matching

Cited by: 1
Authors
Mocerino, Luca [1 ]
Calimera, Andrea [1 ]
Affiliations
[1] Politecnico di Torino, Department of Control and Computer Engineering, I-10129 Turin, Italy
Source
APPLIED SCIENCES-BASEL | 2021, Vol. 11, Iss. 23
Keywords
deep learning; convolutional neural networks; energy efficiency; data reuse; clustering; HW design; learning algorithms; neural networks
DOI
10.3390/app112311164
Abstract
The reduction in energy consumption is key for deep neural networks (DNNs) to ensure usability and reliability, whether they are deployed on low-power end-nodes with limited resources or on high-performance platforms that serve large pools of users. Leveraging the over-parametrization exhibited by many DNN models, convolutional neural networks (ConvNets) in particular, energy efficiency can be improved substantially while preserving model accuracy. The solution proposed in this work exploits the intrinsic redundancy of ConvNets to maximize the reuse of partial arithmetic results during inference. Specifically, the weight-set of a given ConvNet is discretized through a clustering procedure such that the largest possible number of inner multiplications falls into predefined bins; this allows an off-line computation of the most frequent results, which in turn can be stored locally and retrieved when needed during the forward pass. Such a reuse mechanism leads to remarkable energy savings with the aid of a custom processing element (PE) that integrates an associative memory with a standard floating-point unit (FPU). Moreover, the adoption of an approximate associative rule based on a partial bit-match increases the hit rate over the pre-computed results, maximizing the energy reduction even further. Results collected on a set of ConvNets trained for computer vision and speech processing tasks reveal that the proposed associative-based HW-SW co-design achieves up to 77% energy savings with less than 1% accuracy loss.
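A minimal Python sketch of the reuse idea summarized above, under stated assumptions: 1-D k-means-style weight clustering, 8-bit activation quantization, and a software lookup table standing in for the associative memory next to the FPU. All names and parameters (cluster_weights, build_product_table, partial_match_code, N_CLUSTERS, ACT_BITS, match_bits) are illustrative and not taken from the paper, and the partial bit-match is applied here to quantized activation codes only as a simplification of the operand-level matching described in the abstract:

import numpy as np

N_CLUSTERS = 16   # assumed number of weight bins (not specified in the abstract)
ACT_BITS = 8      # assumed activation quantization width

def cluster_weights(weights, k=N_CLUSTERS, iters=20):
    # Plain 1-D k-means over the flattened weight set; a stand-in for the
    # paper's clustering-based weight discretization.
    w = weights.ravel().astype(np.float64)
    centroids = np.linspace(w.min(), w.max(), k)
    for _ in range(iters):
        ids = np.abs(w[:, None] - centroids[None, :]).argmin(axis=1)
        for c in range(k):
            if np.any(ids == c):
                centroids[c] = w[ids == c].mean()
    ids = np.abs(w[:, None] - centroids[None, :]).argmin(axis=1)
    return centroids, ids

def build_product_table(centroids, act_bits=ACT_BITS):
    # Off-line precomputation of every (weight centroid, activation level) product.
    act_levels = np.arange(2 ** act_bits) / (2 ** act_bits - 1)  # activations in [0, 1]
    return np.outer(centroids, act_levels)                       # shape (k, 2**act_bits)

def partial_match_code(codes, match_bits=6, total_bits=ACT_BITS):
    # Illustrative approximate associative rule: match only the top `match_bits`
    # of each activation code so that nearby activations hit the same table entry.
    shift = total_bits - match_bits
    return (codes >> shift) << shift

def lut_dot(weight_ids, act_codes, table):
    # Dot product served entirely from the precomputed table (no multiplications).
    return float(table[weight_ids, act_codes].sum())

# Toy usage on a single dot product (one output element of a convolution).
rng = np.random.default_rng(0)
w = rng.normal(size=128)
a = rng.random(128)  # activations assumed to lie in [0, 1]
centroids, ids = cluster_weights(w)
table = build_product_table(centroids)
codes = np.round(a * (2 ** ACT_BITS - 1)).astype(int)
print("exact dot product :", float(w @ a))
print("exact-match LUT   :", lut_dot(ids, codes, table))
print("partial bit-match :", lut_dot(ids, partial_match_code(codes), table))

In the actual design the table is realized as an associative memory coupled to the FPU inside a custom PE, and relaxing the match to a partial bit comparison trades a small amount of accuracy for a higher hit rate on the pre-computed results, as the abstract describes.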
Pages: 17