Guard-Net: Lightweight Stereo Matching Network via Global and Uncertainty-Aware Refinement for Autonomous Driving

Cited by: 3

Authors:

Liu, Yujun [1]

Zhang, Xiangchen [1]

Luo, Yang [1]

Hao, Qiaoqiao [1]

Su, Jinhe [1]

Cai, Guorong [1]

Affiliations:

[1] Jimei Univ, Sch Comp Engn, Xiamen 361021, Peoples R China

Source:

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS | 2024 / Vol. 25 / No. 08

Funding:

National Natural Science Foundation of China;

Keywords:

Correlation; Autonomous vehicles; Costs; Uncertainty; Solid modeling; Transformers; Optimization; Stereo matching; global feature; disparity refinement; intelligent transportation; autonomous driving;

DOI:

10.1109/TITS.2024.3357841

Chinese Library Classification (CLC) number:

TU [Building Science];

Discipline classification code:

0813;

Abstract:

Stereo matching is a prominent research area in autonomous driving and computer vision. Despite significant progress made by learning-based methods, accurately predicting disparities in hazardous regions, which is crucial for ensuring safe vehicle operation, remains challenging. The limitations of methods based on Convolutional Neural Networks (CNNs) are most noticeable in textureless regions and repetitive patterns, leading to unreliable predictions. Furthermore, calculating disparities for boundaries and thin structures, where the disparity jump phenomenon is prominent, remains difficult. To address these issues, we propose a lightweight stereo matching architecture that focuses on obtaining real-time and high-precision disparity maps in hazardous areas. First, we exploit an efficient global enhanced path to provide global representations in ill-posed regions, where CNN-based approaches often struggle. Second, our model integrates local and global features to generate a more reliable cost volume. Finally, our innovative uncertainty-aware module refines disparity, making full use of high-frequency detailed information and uncertainty attention, effectively preserving complex structures. Comprehensive experimental studies on SceneFlow demonstrate that our method outperforms state-of-the-art methods, achieving an End-Point Error (EPE) of 0.47 with only 3.60M parameters. The effectiveness of our method's speed-accuracy trade-off is further confirmed by competitive results obtained from the KITTI 2012 and KITTI 2015 experiments. Code is available at: https://github.com/YJLCV/Guard-Net.


Pages: 10260-10273

Page count: 14
