StereoEngine: An FPGA-Based Accelerator for Real-Time High-Quality Stereo Estimation With Binary Neural Network

Cited by: 28
Authors
Chen, Gang [1 ]
Ling, Yehua [1 ]
He, Tao [2 ]
Meng, Haitao [2 ]
He, Shengyu [2 ]
Zhang, Yu [1 ]
Huang, Kai [1 ]
Affiliations
[1] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou 510275, Peoples R China
[2] Northeastern Univ, Sch Comp Sci & Engn, Shenyang, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Binary neural network; FPGA accelerator; high-quality stereo estimation; real-time; ACCURATE;
DOI
10.1109/TCAD.2020.3012864
Chinese Library Classification (CLC) Number
TP3 [Computing Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Stereo estimation is essential to many applications, such as mobile autonomous robots, most of which demand real-time response together with high energy and storage efficiency. Deep neural networks (DNNs) have been shown to yield significant accuracy gains. However, DNN-based algorithms are challenging to deploy on energy- and resource-constrained devices because of their high computational complexity. In this article, we present StereoEngine, a fully pipelined end-to-end stereo vision accelerator that computes accurate dense depth in a real-time and energy-efficient manner. An efficient stereo algorithm is developed and optimized for a high-quality, hardware-friendly implementation that leverages a binary neural network (BNN) to learn discriminative binary descriptors and improve disparity accuracy. StereoEngine is designed as a standalone DNN-based stereo vision system in which all processing stages are implemented on the hardware platform. The effectiveness of StereoEngine is evaluated through comprehensive experiments. Compared with software-based implementations on high-end and embedded Nvidia GPUs, StereoEngine achieves up to 3x, 13x, and 50x speedups, as well as up to 211x, 58x, and 73x improvements in energy efficiency, respectively. Furthermore, StereoEngine achieves leading accuracy compared with state-of-the-art hardware implementations on the challenging KITTI dataset.
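For intuition, the sketch below illustrates the kind of binary-descriptor matching the abstract alludes to: per-pixel binary features compared by XOR plus popcount (Hamming distance) to build a matching cost volume, followed by a winner-takes-all disparity pick. The function names (hamming_cost_volume, winner_takes_all), array shapes, and the NumPy implementation are illustrative assumptions for exposition only, not the actual StereoEngine pipeline, which realizes these operations as fixed-function FPGA logic.

import numpy as np


def hamming_cost_volume(desc_left, desc_right, max_disparity):
    # Build a stereo matching cost volume from per-pixel binary descriptors.
    # desc_left / desc_right: (H, W, B) arrays of 0/1 bits, one B-bit descriptor
    # per pixel (e.g., the output of a binarized feature network).
    # Returns an (H, W, max_disparity) volume of Hamming distances.
    H, W, B = desc_left.shape
    cost = np.full((H, W, max_disparity), B, dtype=np.int32)  # worst cost where no candidate exists
    for d in range(max_disparity):
        # For a left pixel at column x, the candidate right pixel sits at column x - d.
        # XOR followed by a popcount gives the Hamming distance; in hardware this maps
        # to cheap LUT and adder-tree logic instead of multiply-accumulate units.
        diff = np.bitwise_xor(desc_left[:, d:, :], desc_right[:, :W - d, :])
        cost[:, d:, d] = diff.sum(axis=-1)
    return cost


def winner_takes_all(cost):
    # Pick, per pixel, the disparity with the lowest matching cost.
    return np.argmin(cost, axis=-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    left = rng.integers(0, 2, size=(8, 32, 64), dtype=np.uint8)  # toy binary descriptors
    right = np.roll(left, shift=-3, axis=1)                      # right view shifted by 3 pixels
    disparity = winner_takes_all(hamming_cost_volume(left, right, max_disparity=8))
    print(disparity[:, 8:24])  # interior columns should mostly read 3

A plain winner-takes-all pick over Hamming costs is only the simplest aggregation; in a fully pipelined end-to-end design such as the one the abstract describes, additional cost-aggregation and post-processing stages would sit between these two steps.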
Pages: 4179-4190
Number of pages: 12