Depth-Guided Aggregation for Real-Time Binocular Depth Estimation Network

Cited by: 0
Authors
Fu, Dongxin [1]
Zheng, Shaowu [1]
Xie, Pengcheng [1]
Li, Weihua [1]
Affiliations
[1] South China University of Technology, Guangzhou, People's Republic of China
Keywords
Costs; Estimation; Feature extraction; Three-dimensional displays; Convolution; Real-time systems; Data mining; Cameras
DOI
10.1109/MMUL.2024.3395695
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Using binocular cameras to obtain depth information for target pixels offers a cost-effective and natural alternative to lidar systems. However, most current binocular depth estimation networks struggle to balance speed and accuracy in real-world conditions, and their prediction accuracy at long range is often limited. In this article, we introduce an end-to-end real-time depth estimation network (RTDENet) that makes efficient use of multiscale cost volumes to improve performance. We propose an efficient and flexible cost aggregation module that supplements residual information with high-resolution cost volumes. By replacing some computationally demanding 3-D convolutional layers with depth-guided excitation, we maintain accuracy while keeping the model's computational cost under control. Combined with a distance-sensitive loss function, RTDENet achieves a global difference of 2.41 m and an inference time of 27 ms on the KITTI Stereo dataset. This balance of speed and accuracy outperforms other state-of-the-art algorithms in depth estimation tasks.
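As background for the abstract's remark on long-range accuracy, the sketch below uses the standard rectified-stereo relation Z = f * B / d to show how a fixed disparity error turns into a larger depth error as distance grows. It is not taken from the article: the function names are hypothetical, and the focal length and baseline are assumed illustrative values, not RTDENet's or the dataset's calibration.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Convert a disparity in pixels to metric depth using Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error_at(depth_m, focal_px, baseline_m, disp_err_px=0.25):
    """Depth overestimate caused by under-estimating disparity by disp_err_px."""
    true_disp = focal_px * baseline_m / depth_m
    if true_disp <= disp_err_px:
        raise ValueError("disparity error exceeds the true disparity")
    estimated = disparity_to_depth(true_disp - disp_err_px, focal_px, baseline_m)
    return estimated - depth_m


# Assumed, illustrative rig parameters (hypothetical, not from the article).
FOCAL_PX = 720.0
BASELINE_M = 0.54

if __name__ == "__main__":
    for depth in (10.0, 40.0, 80.0):
        err = depth_error_at(depth, FOCAL_PX, BASELINE_M)
        print(f"true depth {depth:5.1f} m -> depth error {err:.3f} m")
```

Because depth is inversely proportional to disparity, the same quarter-pixel error costs far more metres at 80 m than at 10 m, which is one plausible motivation for a loss that weights distant pixels differently.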
Pages: 36-47
Page count: 12
Related papers
20 in total
[1]   Pyramid Stereo Matching Network [J].
Chang, Jia-Ren ;
Chen, Yong-Sheng .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :5410-5418
[2]   Cheng X., 2020, P ADV NEUR INF PROC, P169
[3]   Chuah W, 2020, Arxiv, DOI arXiv:2009.04629
[4]   Group-wise Correlation Stereo Network [J].
Guo, Xiaoyang ;
Yang, Kai ;
Yang, Wukui ;
Wang, Xiaogang ;
Li, Hongsheng .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :3268-3277
[5]   End-to-End Learning of Geometry and Context for Deep Stereo Regression [J].
Kendall, Alex ;
Martirosyan, Hayk ;
Dasgupta, Saumitro ;
Henry, Peter ;
Kennedy, Ryan ;
Bachrach, Abraham ;
Bry, Adam .
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, :66-75
[6]   StereoNet: Guided Hierarchical Refinement for Real-Time Edge-Aware Depth Prediction [J].
Khamis, Sameh ;
Fanello, Sean ;
Rhemann, Christoph ;
Kowdle, Adarsh ;
Valentin, Julien ;
Izadi, Shahram .
COMPUTER VISION - ECCV 2018, PT 15, 2018, 11219 :596-613
[7]   A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation [J].
Mayer, Nikolaus ;
Ilg, Eddy ;
Hausser, Philip ;
Fischer, Philipp ;
Cremers, Daniel ;
Dosovitskiy, Alexey ;
Brox, Thomas .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :4040-4048
[8]   Menze M, 2015, PROC CVPR IEEE, P3061, DOI 10.1109/CVPR.2015.7298925
[9]   Cascade Residual Learning: A Two-stage Convolutional Neural Network for Stereo Matching [J].
Pang, Jiahao ;
Sun, Wenxiu ;
Ren, Jimmy S. J. ;
Yang, Chengxi ;
Yan, Qiong .
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017), 2017, :878-886
[10]   Reza MA, 2018, IEEE INT C INT ROBOT, P4751, DOI 10.1109/IROS.2018.8593971