FPGA-Based Acceleration for Bayesian Convolutional Neural Networks

Cited by: 25
Authors
Fan, Hongxiang [1]
Ferianc, Martin [2]
Que, Zhiqiang [1]
Liu, Shuanglong [3]
Niu, Xinyu [4]
Rodrigues, Miguel R. D. [2]
Luk, Wayne [1]
Affiliations
[1] Imperial Coll London, Dept Comp, London SW7 2AZ, England
[2] UCL, Dept Elect & Elect Engn, London WC1E 6BT, England
[3] Hunan Normal Univ, Sch Phys & Elect, Changsha 410081, Peoples R China
[4] Corerain Technol Ltd, Shenzhen 518048, Peoples R China
Funding
National Natural Science Foundation of China; UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
Bayesian convolutional neural network (BayesCNN); deep learning; field-programmable gate array (FPGA); three-dimensional convolutional neural network (3-D CNN)
DOI
10.1109/TCAD.2022.3160948
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Neural networks (NNs) have demonstrated their potential in a variety of domains ranging from computer vision (CV) to natural language processing. Among various NNs, two-dimensional (2-D) and three-dimensional (3-D) convolutional NNs (CNNs) have been widely adopted for a broad spectrum of applications, such as image classification and video recognition, due to their excellent capabilities in extracting 2-D and 3-D features. However, standard 2-D and 3-D CNNs cannot capture their model uncertainty, which is crucial for many safety-critical applications, including healthcare and autonomous driving. In contrast, Bayesian CNNs (BayesCNNs), a variant of CNNs, have demonstrated their ability to express uncertainty in their predictions with mathematical grounding. Nevertheless, BayesCNNs have not been widely used in industrial practice because of their compute requirements: inference involves sampling followed by multiple forward passes through the whole network, which significantly increases computation and memory consumption in comparison to standard CNNs. This article proposes a novel field-programmable gate array (FPGA)-based hardware architecture to accelerate both 2-D and 3-D BayesCNNs based on Monte Carlo dropout (MCD). Compared with other state-of-the-art accelerators for BayesCNNs, the proposed design can achieve up to four times higher energy efficiency and nine times better compute efficiency. An automatic framework capable of supporting partial Bayesian inference is proposed to explore the tradeoff between algorithm and hardware performance. Extensive experiments are conducted to demonstrate that the framework can effectively find the optimal implementations in the design space.
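
The MCD inference loop that the abstract refers to (sample a dropout mask, run a forward pass, repeat, then aggregate) can be sketched in software as follows. This is a minimal illustration only, not the proposed FPGA architecture: the two-layer toy network, the names `make_toy_params` and `mc_dropout_predict`, the dropout rate, and the sample count are all assumptions introduced here. Dropout is applied only before the last layer, so the deterministic first layer is computed once and reused across samples, which is one plausible reading of partial Bayesian inference.

```python
import math
import random


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dense(x, weights, bias):
    # weights holds one row per output unit.
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def dropout(x, p, rng):
    # Inverted dropout that stays active at inference time (the MCD idea).
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in x]


def make_toy_params(n_in=4, n_hidden=8, n_out=3, seed=1):
    # Hypothetical random weights standing in for a trained network.
    rng = random.Random(seed)

    def layer(n_rows, n_cols):
        weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_cols)] for _ in range(n_rows)]
        return weights, [0.0] * n_rows

    return layer(n_hidden, n_in), layer(n_out, n_hidden)


def mc_dropout_predict(x, params, num_samples=32, p=0.5, seed=0):
    rng = random.Random(seed)
    (w1, b1), (w2, b2) = params
    # Deterministic part: computed once and shared by every sample.
    hidden = relu(dense(x, w1, b1))
    # Bayesian part: one stochastic forward pass per sample.
    samples = [softmax(dense(dropout(hidden, p, rng), w2, b2)) for _ in range(num_samples)]
    n_classes = len(samples[0])
    mean = [sum(s[c] for s in samples) / num_samples for c in range(n_classes)]
    # Entropy of the averaged prediction, used here as an uncertainty score.
    entropy = -sum(q * math.log(q) for q in mean if q > 0.0)
    return mean, entropy


if __name__ == "__main__":
    mean, entropy = mc_dropout_predict([0.5, -1.0, 2.0, 0.1], make_toy_params())
    print("mean prediction:", mean)
    print("predictive entropy:", entropy)
```

In this sketch the cost grows linearly with `num_samples` only for the layers after the dropout point, which hints at why the choice of which layers are Bayesian affects both accuracy and hardware cost.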
Pages: 5343-5356
Page count: 14