Clock Gating-Based Effectual Realization of Stochastic Hyperbolic Tangent Function for Deep Neural Hardware Accelerators

Cited by: 0
Authors
Gunjan Rajput
V. Logashree
Kunika Naresh Biyani
Santosh Kumar Vishvakarma
Affiliation
[1] Indian Institute of Technology Indore, Electrical Engineering
Source
Circuits, Systems, and Signal Processing | 2023 / Vol. 42
Keywords
Activation function; Deep neural network; Hyperbolic tangent (Tanh); Clock gating; Stochastic computing; VLSI implementation
DOI
Not available
Abstract
The wide range of neural network applications has led to custom schemes for accelerating computation in ASIC implementations. Hence, the choice of activation function in a neural network is an essential requirement. However, designing a digital architecture for an activation function is difficult, because the non-linearity of these functions demands additional hardware resources. This paper proposes an efficient hyperbolic tangent (tanh) function based entirely on stochastic computing. The tanh implementation uses clock gating to curtail dynamic power dissipation. The results are obtained by implementing two different clock gating techniques on the proposed hardware. The proposed clock gating-based stochastic design of the activation function is efficient in terms of area, power, and delay, with negligible accuracy loss. Accuracy is evaluated on the MNIST dataset using the LeNet benchmark architecture.
Furthermore, post-synthesis results show that, for the proposed clock gating design, area is reduced by ≈ 70.62%, power by ≈ 58.19%, and delay by ≈ 98.87% compared to the state of the art.
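The abstract does not describe the circuit itself. As a rough illustration of how a stochastic tanh can be computed, the Python sketch below models the classic saturating-counter finite-state machine known from the stochastic computing literature. It is an assumption made for illustration only, not the authors' design, and it does not model clock gating. The function names `to_bipolar_stream`, `from_bipolar_stream`, and `stanh`, as well as the stream length and state count, are hypothetical choices.

```python
import math
import random


def to_bipolar_stream(x, length, rng):
    """Encode x in [-1, 1] as bits with P(bit = 1) = (x + 1) / 2 (bipolar format)."""
    p = (x + 1.0) / 2.0
    return [1 if rng.random() < p else 0 for _ in range(length)]


def from_bipolar_stream(bits):
    """Decode a bipolar bit stream back to a value in [-1, 1]."""
    return 2.0 * sum(bits) / len(bits) - 1.0


def stanh(bits, n_states=8):
    """Saturating up/down counter with n_states states (illustrative model).

    Each input 1 moves the state up and each input 0 moves it down,
    saturating at both ends. The output bit is 1 while the state lies in
    the upper half. In bipolar format the decoded output approximates
    tanh((n_states / 2) * x) for an input value x.
    """
    half = n_states // 2
    state = half  # start in the middle of the state range
    out = []
    for b in bits:
        if b:
            state = min(state + 1, n_states - 1)
        else:
            state = max(state - 1, 0)
        out.append(1 if state >= half else 0)
    return out


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for x in (-0.5, -0.25, 0.0, 0.25, 0.5):
        bits = to_bipolar_stream(x, 100_000, rng)
        y = from_bipolar_stream(stanh(bits, n_states=8))
        print(f"x={x:+.2f}  stochastic={y:+.3f}  tanh(4x)={math.tanh(4 * x):+.3f}")
```

In this model, longer bit streams reduce the random error of the estimate at the cost of more clock cycles, which is the usual accuracy-versus-latency trade-off in stochastic computing.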
Pages: 5978-6000
Page count: 22