A robust and real-time DNN-based multi-baseline stereo accelerator in FPGAs

Cited by: 3
Authors
Zhang, Yu [1]
Zheng, Yi [1]
Ling, Yehua [1]
Meng, Haitao [1]
Chen, Gang [1]
Affiliations
[1] Sun Yat-sen Univ, Guangzhou, Peoples R China
Keywords
Stereo vision; Hardware accelerator; Real-time; Resource-efficient; BNN; DEPTH ESTIMATION;
DOI
10.1016/j.sysarc.2023.102966
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Stereo vision is essential to many robotic applications, such as obstacle avoidance and 3D mapping in autonomous-driving cars, and it requires substantial design effort to balance robustness and speed. Most existing stereo vision accelerators are designed for two-camera setups and use handcrafted features for stereo estimation. This setup generally suffers from several drawbacks, including limited sensing range and high error rates. To resolve these issues, we present Ultra-Stereo, a deep neural network based multi-baseline stereo accelerator for resource-constrained FPGAs that provides a robust, real-time solution for depth sensing. In Ultra-Stereo, a binary neural network (BNN) is leveraged to obtain discriminative feature descriptors, while a dynamic weighted cost fusion strategy suppresses local minima in the matching cost function, which effectively avoids incorrect stereo estimates. In addition, we provide a set of optimized hardware modules for Ultra-Stereo to estimate the depth map from multiple cameras in real time. To process the data streams from multiple cameras efficiently, we first propose dedicated multi-baseline fusion hardware architectures for shorter-baseline and longer-baseline stereo pairs. Then a time-sharing BNN is used to bridge the gap between processing speed and resource efficiency. Evaluation results demonstrate that Ultra-Stereo achieves higher matching accuracy than existing solutions while guaranteeing real-time responsiveness.
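The abstract names two ideas that a small example can make concrete: matching costs computed on binary feature descriptors, and a weighted fusion of cost curves from several baselines. The following is a minimal software sketch of that general idea, not the Ultra-Stereo implementation. The function names, the left-reference convention (pixel x matches x - d), the integer baseline ratio `scale`, and the confidence weight (gap between the best and second-best cost) are all assumptions made for illustration; the record does not describe the paper's actual weighting rule.

```python
from typing import List, Sequence


def hamming(a: int, b: int) -> int:
    """Hamming distance between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def cost_curve(ref_row: Sequence[int], tgt_row: Sequence[int], x: int,
               max_disp: int, bits: int, scale: int = 1) -> List[int]:
    """Matching cost of reference pixel x for disparities 0..max_disp-1.

    `scale` is an assumed integer baseline ratio: disparity d in the
    shortest pair corresponds to scale * d in this pair.
    """
    costs = []
    for d in range(max_disp):
        xt = x - scale * d
        # Candidates that fall outside the image get the worst possible cost.
        costs.append(hamming(ref_row[x], tgt_row[xt]) if xt >= 0 else bits)
    return costs


def confidence(costs: Sequence[float]) -> float:
    """Hypothetical weight: gap between the two lowest costs.

    A curve with two equally good minima (ambiguous match) gets weight 0.
    """
    lowest, second = sorted(costs)[:2]
    return second - lowest


def fuse(curves: Sequence[Sequence[float]]) -> List[float]:
    """Weighted average of per-baseline cost curves (weights recomputed per pixel)."""
    weights = [confidence(c) + 1e-6 for c in curves]  # epsilon avoids a zero sum
    total = sum(weights)
    return [sum(w * c[d] for w, c in zip(weights, curves)) / total
            for d in range(len(curves[0]))]


def best_disparity(costs: Sequence[float]) -> int:
    """Winner-takes-all: index of the lowest cost."""
    return min(range(len(costs)), key=costs.__getitem__)


# Toy curves: the short-baseline curve has two equal minima (d=1 and d=3),
# the wide-baseline curve has a single clear minimum at d=3.
short = [5, 0, 4, 0, 6]
wide = [6, 5, 4, 1, 6]
fused = fuse([short, wide])
best = best_disparity(fused)
```

In this toy case the short-baseline curve alone would pick the first of its two tied minima, while the fused curve follows the unambiguous wide-baseline minimum. A hardware design would replace the Python loops with parallel XOR/popcount units; that mapping is outside what this sketch shows.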
Pages: 10
Related papers
35 in total
[1]   ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial, and Multimap SLAM [J].
Campos, Carlos ;
Elvira, Richard ;
Gomez Rodriguez, Juan J. ;
Montiel, Jose M. M. ;
Tardos, Juan D. .
IEEE TRANSACTIONS ON ROBOTICS, 2021, 37 (06) :1874-1890
[2]   Efficient stereo matching on embedded GPUs with zero-means cross correlation [J].
Chang, Qiong ;
Zha, Aolong ;
Wang, Weimin ;
Liu, Xin ;
Onishi, Masaki ;
Lei, Lei ;
Er, Meng Joo ;
Maruyama, Tsutomu .
JOURNAL OF SYSTEMS ARCHITECTURE, 2022, 123
[3]   Real-Time Stereo Vision System: A Multi-Block Matching on GPU [J].
Chang, Qiong ;
Maruyama, Tsutomu .
IEEE ACCESS, 2018, 6 :42030-42046
[4]   StereoEngine: An FPGA-Based Accelerator for Real-Time High-Quality Stereo Estimation With Binary Neural Network [J].
Chen, Gang ;
Ling, Yehua ;
He, Tao ;
Meng, Haitao ;
He, Shengyu ;
Zhang, Yu ;
Huang, Kai .
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2020, 39 (11) :4179-4190
[5]   GPU-Accelerated Real-Time Stereo Estimation With Binary Neural Network [J].
Chen, Gang ;
Meng, Haitao ;
Liang, Yucheng ;
Huang, Kai .
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2020, 31 (12) :2896-2907
[6]  
da Silveira TLT, 2019, 2019 26TH IEEE CONFERENCE ON VIRTUAL REALITY AND 3D USER INTERFACES (VR), P9, DOI 10.1109/VR.2019.8798281
[7]  
Denker K., 2011, Accurate real-time multi-camera stereo-matching on the gpu for 3d reconstruction
[8]  
Dosovitskiy A, 2017, PR MACH LEARN RES, V78
[9]   Fast multiple-baseline stereo with occlusion [J].
Drouin, MA ;
Trudeau, M ;
Roy, S .
FIFTH INTERNATIONAL CONFERENCE ON 3-D DIGITAL IMAGING AND MODELING, PROCEEDINGS, 2005, :540-547
[10]  
Fan R, 2018, IEEE CONF IMAGING SY, P63