Disparity Refinement Based on Cross-Modal Feature Fusion and Global Hourglass Aggregation for Robust Stereo Matching

Citations: 0
Authors
Wang, Gang [1]
Yang, Jinlong [1]
Wang, Yinghui [1]
Affiliations
[1] Jiangnan Univ, Sch Artificial Intelligence & Comp Sci, Wuxi 214122, Peoples R China
Source
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT VI | 2025 / Vol. 15036
Keywords
Stereo Matching; Iterative Optimization; Disparity Refinement; Feature Fusion
DOI
10.1007/978-981-97-8508-7_15
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Stereo matching is a critical research area in computer vision. The advancement of deep learning has led to the gradual replacement of cost-filtering methods by iterative optimization techniques, characterized by outstanding generalization performance. However, cost volumes constructed solely through recurrent all-pairs field transforms in iterative optimization methods lack adequate image information, making it challenging to resolve blurring issues in pathological regions such as illumination changes or similar textures. In this paper, we propose SCA-Stereo, a disparity refinement network aimed at further optimizing the initial disparity map generated by iteration. First, we introduce a high- and low-frequency feature extractor to delve deeper into the structural and fine feature information inherent in the image. Furthermore, we propose a cross-modal feature fusion module to facilitate the exchange and integration of diverse features, expanding the receptive field to enhance information flow. Finally, we design a global hourglass aggregation network to efficiently capture non-local interactions between fusion features. Extensive experiments conducted across Scene Flow, KITTI, Middlebury, and ETH3D demonstrate the effectiveness of SCA-Stereo in achieving state-of-the-art stereo matching performance.
Pages: 211-225
Page count: 15