Multi-scale inputs and context-aware aggregation network for stereo matching

Cited by: 1
Authors
Shi, Liqing [1, 2, 3]
Xiong, Taiping [1, 2]
Cui, Gengshen [2]
Pan, Minghua [2]
Cheng, Nuo [1, 2]
Wu, Xiangjie [1, 2]
Affiliations
[1] Guilin Univ Elect Technol, Guangxi Key Lab Image & Graph Intelligent Proc, Guilin 541004, Peoples R China
[2] Guilin Univ Elect Technol, Sch Comp Sci & Informat Secur, Guilin 541004, Peoples R China
[3] Guilin Univ Elect Technol, Nanning Res Inst, Nanning 530000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multi-scale feature fusion; Context-aware capability; 3D squeeze-and-excitation; Stereo matching; Binocular vision;
DOI
10.1007/s11042-024-18492-6
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Despite significant progress in deep learning-based stereo matching, the accuracy of these methods drops markedly in the presence of occlusions, reflections, textureless areas, and scale variations. In this paper, we propose MSCANet, a novel stereo matching network that integrates multi-scale inputs with context-aware aggregation. MSCANet fuses rich multi-scale feature information and exhibits context-aware capability, which enables it to achieve superior performance. First, a multi-scale aware fusion module is designed to efficiently incorporate more comprehensive global context features at different scales, which improves the model's ability to generalize across images of varying scales. Second, a novel V-shaped encoder/decoder module is developed to exploit the rich feature information. In the encoding stage, a 3D squeeze-and-excitation block is introduced to adaptively recalibrate the learned feature maps. This block suppresses irrelevant features while enhancing useful ones, which improves efficiency and accuracy in disparity prediction. In the decoding stage, a 3D context-aware decode block is designed to use global context features to restore the original image structure. Moreover, the high-level feature maps can be employed to augment the low-level feature maps by incorporating more detailed information, avoiding the side effects caused by information loss during encoding. Extensive ablation and comparative experiments were conducted on the Scene Flow, KITTI 2012, and KITTI 2015 datasets to validate the effectiveness of each proposed module. The experimental results demonstrate that MSCANet achieves competitive performance while offering a more straightforward and efficient model design and faster inference speed.
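As a rough illustration of the 3D squeeze-and-excitation idea named in the abstract, the sketch below shows the generic squeeze, excitation, and rescaling steps applied to a multi-channel volume. It is not the authors' implementation: the function names, the two weight matrices `w1` and `w2`, and the layout (each channel stored as a flat list of its D x H x W values) are assumptions made only to keep the example short and dependency-free.

```python
import math

def squeeze(volume):
    # Squeeze: global average over all D*H*W positions of each channel.
    # Assumption: each channel is stored as a flat list of its voxels.
    return [sum(channel) / len(channel) for channel in volume]

def excite(descriptor, w1, w2):
    # Excitation: bottleneck FC layer + ReLU, then FC layer + sigmoid.
    # w1 has shape (C_reduced x C), w2 has shape (C x C_reduced).
    hidden = [max(0.0, sum(w * x for w, x in zip(row, descriptor)))
              for row in w1]
    return [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
            for row in w2]

def se_block_3d(volume, w1, w2):
    # Recalibration: scale every voxel of a channel by that channel's gate,
    # so channels with small gates are suppressed and others are kept.
    gates = excite(squeeze(volume), w1, w2)
    return [[g * v for v in channel] for g, channel in zip(gates, volume)]

if __name__ == "__main__":
    # Two channels, each a flattened 1 x 2 x 2 volume (hypothetical values).
    volume = [[1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 2.0, 2.0]]
    w1 = [[0.5, -0.5]]        # reduce 2 channels to 1 hidden unit
    w2 = [[1.0], [-1.0]]      # expand back to 2 channel gates
    print(se_block_3d(volume, w1, w2))
```

In a real network the weights would be learned and the volume would be a 5D tensor (batch, channel, disparity, height, width); the per-channel gating shown here is the part that carries over.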
Pages: 75171 - 75194
Number of pages: 24
Related papers
50 items in total
  • [11] Progressive Context-Aware Aggregation Network Combining Multi-Scale and Multi-Level Dense Reconstruction for Building Change Detection
    Xu, Chuan
    Ye, Zhaoyi
    Mei, Liye
    Yang, Wei
    Hou, Yingying
    Shen, Sen
    Ouyang, Wei
    Ye, Zhiwei
    REMOTE SENSING, 2023, 15 (08)
  • [12] Multi-Scale Dense Attention Network for Stereo Matching
    Chang, Yuhui
    Xu, Jiangtao
    Gao, Zhiyuan
    ELECTRONICS, 2020, 9 (11) : 1 - 12
  • [13] Global context-aware multi-scale features aggregative network for salient object detection
    Ullah, Inam
    Jian, Muwei
    Hussain, Sumaira
    Lian, Li
    Ali, Zafar
    Qureshi, Imran
    Guo, Jie
    Yin, Yilong
    NEUROCOMPUTING, 2021, 455 : 139 - 153
  • [14] Bridging Multi-Scale Context-Aware Representation for Object Detection
    Wang, Boying
    Ji, Ruyi
    Zhang, Libo
    Wu, Yanjun
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (05) : 2317 - 2329
  • [15] Multi-Scale Based Context-Aware Net for Action Detection
    Liu, Haijun
    Wang, Shiguang
    Wang, Wen
    Cheng, Jian
    IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (02) : 337 - 348
  • [16] CONTEXT-AWARE EVENT-DRIVEN STEREO MATCHING
    Zou, Dongqing
    Guo, Ping
    Wang, Qiang
    Wang, Xiaotao
    Shao, Guangqi
    Shi, Feng
    Li, Jia
    Park, Paul-K. J.
    2016 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2016, : 1076 - 1080
  • [17] Joint Bilateral Filter and Multi-Scale Cost Aggregation in Stereo Matching
    Ye You-ping
    Zheng Hong
    Chen Hao
    Yang Yu
    PROCEEDINGS OF THE 4TH INTERNATIONAL CONFERENCE ON MECHATRONICS, MATERIALS, CHEMISTRY AND COMPUTER ENGINEERING 2015 (ICMMCCE 2015), 2015, 39 : 2945 - 2948
  • [18] Adaptive Multi-scale Cost Volume Construction and Aggregation for Stereo Matching
    Pang Y.-W.
    Su C.
    Long T.
    Dongbei Daxue Xuebao/Journal of Northeastern University, 2023, 44 (04) : 457 - 468
  • [19] Multi-scale attention context-aware network for detection and localization of image splicing: Efficient and robust identification network
    Ruyong Ren
    Shaozhang Niu
    Junfeng Jin
    Jiwei Zhang
    Hua Ren
    Xiaojie Zhao
    Applied Intelligence, 2023, 53 : 18219 - 18238
  • [20] Multi-Scale Cost Volumes Cascade Network for Stereo Matching
    Jia, Xiaogang
    Chen, Wei
    Liang, Chen Li Zhengfa
    Wu, Mingfei
    Tan, Yusong
    Huang, Libo
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 8657 - 8663