Depth-Guided Aggregation for Real-Time Binocular Depth Estimation Network

Cited by: 0

Authors:

Fu, Dongxin [1]

Zheng, Shaowu [1]

Xie, Pengcheng [1]

Li, Weihua [1]

Affiliations:

[1] South China University of Technology, Guangzhou, People's Republic of China

Source:

IEEE MultiMedia | 2024, Vol. 31, No. 2

Keywords:

Costs; Estimation; Feature extraction; Three-dimensional displays; Convolution; Real-time systems; Data mining; Cameras

DOI:

10.1109/MMUL.2024.3395695

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Subject classification code:

0812

Abstract:

Using binocular cameras to obtain depth information of target pixels offers a cost-effective and natural alternative to lidar systems. However, most of the current binocular depth estimation networks have difficulty achieving a better balance between speed and accuracy in real-world situations, and their prediction accuracy for long-range depth is often limited. In this article, we introduce the end-to-end real-time depth estimation network (RTDENet), which efficiently utilizes multiscale cost volumes for improved performance. We propose an efficient and flexible cost aggregation module that supplements residual information with high-resolution cost volumes. By replacing some computationally demanding 3-D convolutional layers with depth-guided excitation, we maintain accuracy while effectively controlling model computation. Alongside the distance-sensitive loss function, RTDENet achieves a global difference of 2.41 m and an inference time of 27 ms on the KITTI Stereo dataset. This balance of speed and accuracy outperforms other state-of-the-art algorithms in depth estimation tasks.
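As background for the abstract's remark that long-range depth accuracy is often limited, the following minimal sketch uses the standard rectified-stereo relation Z = f * B / d to show how a fixed disparity error turns into a depth error that grows roughly with the square of the distance. It is not taken from the article and does not represent RTDENet or its distance-sensitive loss. The function names are hypothetical, and the focal length, baseline, and half-pixel error are illustrative assumptions, not values reported by the authors.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Standard pinhole relation for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error(depth_m, focal_px, baseline_m, disparity_err_px):
    """First-order depth error: |dZ| ~= Z^2 / (f * B) * |dd|."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_err_px


# Illustrative camera parameters (assumed, not from the article).
FOCAL_PX = 720.0
BASELINE_M = 0.54
DISPARITY_ERR_PX = 0.5

for disparity in (40.0, 10.0, 5.0):
    z = depth_from_disparity(disparity, FOCAL_PX, BASELINE_M)
    err = depth_error(z, FOCAL_PX, BASELINE_M, DISPARITY_ERR_PX)
    print(f"disparity={disparity:5.1f} px  depth={z:6.2f} m  error~{err:5.2f} m")
```

Under these assumed parameters, halving the disparity doubles the depth but quadruples the first-order depth error, which is one reason distant pixels are harder to estimate well.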

Pages: 36-47

Page count: 12
