SA-MVSNet: Self-attention-based multi-view stereo network for 3D reconstruction of images with weak texture

Cited by: 4
Authors
Yang, Ronghao [1]
Miao, Wang [1]
Zhang, Zhenxin [2, 3]
Liu, Zhenlong [1]
Li, Mubai [2, 3]
Lin, Bin [1]
Affiliations
[1] Chengdu Univ Technol, Coll Earth Sci, Chengdu 610059, Sichuan, Peoples R China
[2] Capital Normal Univ, Key Lab 3D Informat Acquisit & Applicat, MOE, Beijing 100048, Peoples R China
[3] Capital Normal Univ, Coll Resource Environm & Tourism, Beijing 100048, Peoples R China
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China
Keywords
Multi-view stereo; Depth estimation; Self-attention; Transformer; Weak texture; Adaptive propagation
DOI
10.1016/j.engappai.2023.107800
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Multi-view stereo (MVS) reconstruction is a key task in image-based 3D reconstruction, and deep learning-based methods achieve better results than traditional algorithms. However, most current deep learning-based MVS methods extract image features with convolutional neural networks (CNNs), which cannot aggregate long-range context or capture robust global information. In addition, when depth maps are fused into point clouds, confidence filters discard the low-confidence depth values found in weak-texture areas. Together, these problems lead to low completeness in the 3D reconstruction of weak-texture and textureless areas. To address them, this paper proposes SA-MVSNet, which builds on PatchmatchNet and adds a self-attention mechanism. First, we design a coarse-to-fine network framework for depth map estimation. In the feature extraction network, a pyramid-structured module based on the Swin Transformer block replaces the original Feature Pyramid Network (FPN), and a global self-attention mechanism strengthens the self-correlation between weak-texture areas. Second, we propose a self-attention-based adaptive propagation module (SA-AP), which computes self-attention within the depth propagation window to obtain relative weights between the current pixel and the other pixels in the window, and then adaptively samples the depth values of neighbors on the same surface for propagation. Experiments show that SA-MVSNet significantly improves the completeness of 3D reconstruction for weak-texture images on the DTU (provided by the Technical University of Denmark), BlendedMVS, and Tanks and Temples datasets. Our code is available at https://github.com/miaowang525/SA-MVSNet.
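The SA-AP idea in the abstract (attention weights computed inside a depth propagation window, then used to pick neighbors that likely lie on the same surface) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, which is in the linked repository. The function names `window_attention_weights` and `propagate_depth`, the scaled dot-product scoring, the edge clamping, and the `top_k` selection rule are all assumptions made for illustration.

```python
import numpy as np


def window_attention_weights(features, center, radius=1):
    """Softmax attention weights between one pixel and its window neighbors.

    features: (H, W, C) per-pixel feature map (hypothetical input).
    center:   (row, col) of the current pixel.
    Returns the weights (K,) plus the row and column indices of the window.
    """
    h, w, c = features.shape
    y, x = center
    # Clamp window coordinates at the image border (an assumption of this sketch).
    ys = np.clip(np.arange(y - radius, y + radius + 1), 0, h - 1)
    xs = np.clip(np.arange(x - radius, x + radius + 1), 0, w - 1)
    window = features[np.ix_(ys, xs)].reshape(-1, c)  # keys, shape (K, C)
    query = features[y, x]                            # query, shape (C,)
    scores = window @ query / np.sqrt(c)              # scaled dot-product scores
    scores = scores - scores.max()                    # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum()
    return weights, ys, xs


def propagate_depth(depth, features, center, radius=1, top_k=4):
    """Pick the top_k neighbor depths with the highest attention weight."""
    weights, ys, xs = window_attention_weights(features, center, radius)
    candidates = depth[np.ix_(ys, xs)].reshape(-1)
    order = np.argsort(weights)[::-1][:top_k]
    return candidates[order], weights[order]


# Toy scene: two surfaces with distinct features and depths.
features = np.zeros((5, 5, 2))
features[:, :3] = [4.0, 0.0]   # left surface
features[:, 3:] = [0.0, 4.0]   # right surface
depth = np.where(np.arange(5)[None, :] < 3, 1.0, 5.0) * np.ones((5, 5))

picked_depths, picked_weights = propagate_depth(depth, features, center=(2, 2))
```

In this toy scene the pixel at `(2, 2)` sits next to the boundary between the two surfaces. Neighbors with similar features receive nearly all of the attention weight, so the selected depth candidates come from the pixel's own surface rather than from across the boundary.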
Pages: 15