Multi-scale inputs and context-aware aggregation network for stereo matching

Cited by: 1
Authors
Shi, Liqing [1 ,2 ,3 ]
Xiong, Taiping [1 ,2 ]
Cui, Gengshen [2 ]
Pan, Minghua [2 ]
Cheng, Nuo [1 ,2 ]
Wu, Xiangjie [1 ,2 ]
Affiliations
[1] Guilin Univ Elect Technol, Guangxi Key Lab Image & Graph Intelligent Proc, Guilin 541004, Peoples R China
[2] Guilin Univ Elect Technol, Sch Comp Sci & Informat Secur, Guilin 541004, Peoples R China
[3] Guilin Univ Elect Technol, Nanning Res Inst, Nanning 530000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multi-scale feature fusion; Context-aware capability; 3D squeeze-and-excitation; Stereo matching; Binocular vision;
DOI
10.1007/s11042-024-18492-6
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
Despite significant progress in deep learning-based stereo matching, the accuracy of these methods drops markedly under occlusions, reflections, textureless areas, and scale variations. In this paper, we propose MSCANet, a novel stereo matching network that integrates multi-scale inputs with context-aware aggregation. By combining rich multi-scale feature information with context-aware capability, MSCANet achieves superior performance. First, a multi-scale aware fusion module is designed to efficiently incorporate more comprehensive global context features at different scales, which improves the model's ability to generalize across images of varying scales. Second, a novel V-shaped encoder/decoder module is developed to exploit the rich feature information. In the encoding stage, a 3D squeeze-and-excitation block is introduced to adaptively recalibrate the learned feature maps. This block suppresses irrelevant features while enhancing useful ones, which improves efficiency and accuracy in disparity prediction. Additionally, a 3D context-aware decode block is designed to use global context features to restore the original image structure during the decoding stage. Moreover, the high-level feature maps can be employed to augment low-level feature maps by incorporating more detailed information, avoiding the side effects caused by the loss of information during encoding. Extensive ablation and comparative experiments were conducted on the Scene Flow, KITTI2012, and KITTI2015 datasets to validate the effectiveness of each proposed module. The experimental results demonstrate that MSCANet achieves competitive performance while offering a more straightforward and efficient model design and faster inference speed.
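The abstract names a 3D squeeze-and-excitation block but does not give its layout. As a rough illustration only, the sketch below applies the standard squeeze-and-excitation recipe (global average pooling, a two-layer bottleneck, a sigmoid gate, channel-wise rescaling) to a 3D feature volume. The function name `se3d`, the tensor sizes, the reduction width `R`, the random weights, and the omission of bias terms are all assumptions made for this example and are not taken from MSCANet. It is written in plain Python so that it runs without dependencies; a real model would use a tensor library.

```python
import math
import random


def se3d(volume, w1, w2):
    """Channel-wise squeeze-and-excitation over a 3D feature volume.

    volume: list of C channels, each a flat list of D*H*W voxel values.
    w1: R x C weights of the reduction layer (hypothetical, no bias).
    w2: C x R weights of the expansion layer (hypothetical, no bias).
    Returns the rescaled volume and the per-channel gates.
    """
    # Squeeze: global average pooling over all voxels of each channel.
    z = [sum(ch) / len(ch) for ch in volume]
    # Excitation, step 1: reduce C -> R and apply ReLU.
    hidden = [max(0.0, sum(w * x for w, x in zip(row, z))) for row in w1]
    # Excitation, step 2: expand R -> C and apply a sigmoid gate in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every voxel of a channel by that channel's gate.
    scaled = [[g * v for v in ch] for g, ch in zip(gates, volume)]
    return scaled, gates


# Toy sizes, chosen only for illustration.
random.seed(0)
C, D, H, W, R = 4, 2, 3, 3, 2
volume = [[random.uniform(-1, 1) for _ in range(D * H * W)] for _ in range(C)]
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(R)]
w2 = [[random.uniform(-1, 1) for _ in range(R)] for _ in range(C)]

scaled, gates = se3d(volume, w1, w2)
print("per-channel gates:", [round(g, 3) for g in gates])
```

Each gate lies strictly between 0 and 1, so channels the bottleneck scores as less useful are attenuated rather than removed outright, which is the sense in which such a block "suppresses irrelevant features while enhancing useful ones".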
Pages: 75171-75194
Number of pages: 24
Related papers
50 in total
  • [21] Multi-scale graph neural network for global stereo matching
    Wang, Xiaofeng
    Yu, Jun
    Sun, Zhiheng
    Sun, Jiameng
    Su, Yingying
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2023, 118
  • [22] Deep Photometric Stereo Network with Multi-Scale Feature Aggregation
    Yu, Chanki
    Lee, Sang Wook
    SENSORS, 2020, 20 (21) : 1 - 14
  • [23] CDMC-Net: Context-Aware Image Deblurring Using a Multi-scale Cascaded Network
    Zhao, Qian
    Zhou, Dongming
    Yang, Hao
    NEURAL PROCESSING LETTERS, 2023, 55 (04) : 3985 - 4006
  • [25] TransMVSNet: Global Context-aware Multi-view Stereo Network with Transformers
    Ding, Yikang
    Yuan, Wentao
    Zhu, Qingtian
    Zhang, Haotian
    Liu, Xiangyue
    Wang, Yuanjiang
    Liu, Xiao
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 8575 - 8584
  • [26] Context-Aware Interaction Network for Question Matching
    Hu, Zhe
    Fu, Zuohui
    Yin, Yu
    de Melo, Gerard
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 3846 - 3853
  • [27] Multi-scale attention context-aware network for detection and localization of image splicing: Efficient and robust identification network
    Ren, Ruyong
    Niu, Shaozhang
    Jin, Junfeng
    Zhang, Jiwei
    Ren, Hua
    Zhao, Xiaojie
    APPLIED INTELLIGENCE, 2023, 53 (15) : 18219 - 18238
  • [28] Multi-scale Cross-form Pyramid Network for Stereo Matching
    Zhu, Zhidong
    He, Mingyi
    Dai, Yuchao
    Rao, Zhibo
    Li, Bo
    PROCEEDINGS OF THE 2019 14TH IEEE CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA 2019), 2019, : 1789 - 1794
  • [29] Multi-Scale Cost Attention and Adaptive Fusion Stereo Matching Network
    Liu, Zhenguo
    Li, Zhao
    Ao, Wengang
    Zhang, Shaoshuang
    Liu, Wenlong
    He, Yizhi
    ELECTRONICS, 2023, 12 (07)
  • [30] MAGE: Multi-scale Context-aware Interaction based on Multi-granularity Embedding for Chinese Medical Question Answer Matching
    Wang, Meiling
    He, Xiaohai
    Liu, Yan
    Qing, Linbo
    Zhang, Zhao
    Chen, Honggang
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2023, 228