Multi-view stereo network with point attention

Cited by: 1
Authors
Zhao, Rong [1]
Gu, Zhuoer [2]
Han, Xie [1]
He, Ligang [3]
Sun, Fusheng [1]
Jiao, Shichao [1]
Affiliations
[1] North Univ China, Sch Comp Sci & Technol, Taiyuan, Peoples R China
[2] China Agr Univ, Natl Innovat Ctr Digital Fishery, Beijing, Peoples R China
[3] Univ Warwick, Dept Comp, Coventry, W Midlands, England
Funding
National Key Research and Development Program of China;
Keywords
Multi-view stereo; Deep learning; Attention mechanism; Depth map;
DOI
10.1007/s10489-023-04806-y
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, learning-based multi-view stereo (MVS) reconstruction has gained superiority when compared with traditional methods. In this paper, we introduce a novel point-attention network, with an attention mechanism, based on the point cloud structure. During the reconstruction process, our method with an attention mechanism can guide the network to pay more attention to complex areas such as thin structures and low-texture surfaces. We first infer a coarse depth map using a modified classical MVS deep framework and convert it into the corresponding point cloud. Then, we add the high-frequency features and different-resolution features of the raw images to the point cloud. Finally, our network guides the weight distribution of points in different dimensions through the attention mechanism and computes the depth displacement of each point iteratively as the depth residual, which is added to the coarse depth prediction to obtain the final high-resolution depth map. Experimental results show that our proposed point-attention architecture can achieve a significant improvement in some scenes without reasonable geometrical assumptions on the DTU dataset and the Tanks and Temples dataset, suggesting that our method has a strong generalization ability.
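The two geometric steps named in the abstract, converting a coarse depth map into a point cloud and adding a depth residual to the coarse prediction, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a simple pinhole camera, the function names `unproject` and `refine_depth` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical, and the residual values are made up, whereas in the paper they are predicted by the point-attention network.

```python
def unproject(depth, fx, fy, cx, cy):
    """Back-project a depth map (list of rows) into camera-space 3D points.

    Assumes a pinhole camera model (an assumption of this sketch):
    X = (u - cx) * d / fx, Y = (v - cy) * d / fy, Z = d.
    """
    points = []
    for v, row in enumerate(depth):
        for u, d in enumerate(row):
            points.append(((u - cx) * d / fx, (v - cy) * d / fy, d))
    return points


def refine_depth(depth, residual):
    """Add a per-pixel depth residual to the coarse depth map."""
    return [[d + r for d, r in zip(d_row, r_row)]
            for d_row, r_row in zip(depth, residual)]


# Toy 2x2 coarse depth map and a hypothetical residual
# (the paper's network would predict the residual iteratively).
coarse = [[2.0, 2.0], [4.0, 4.0]]
residual = [[0.5, -0.5], [0.0, 1.0]]

points = unproject(coarse, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
refined = refine_depth(coarse, residual)
```

Here `points` holds one 3D point per pixel, which is the structure the abstract says is augmented with image features, and `refined` is the coarse depth plus the residual.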
Pages: 26622 - 26636
Page count: 15
Related papers
50 in total
  • [1] Multi-view stereo network with point attention
    Rong Zhao
    Zhuoer Gu
    Xie Han
    Ligang He
    Fusheng Sun
    Shichao Jiao
    Applied Intelligence, 2023, 53 : 26622 - 26636
  • [2] Multi-view Stereo Network with Attention Thin Volume
    Wan, Zihang
    Xu, Chao
    Hu, Jing
    Xiao, Jian
    Meng, Zhaopeng
    Chen, Jitai
    PRICAI 2022: TRENDS IN ARTIFICIAL INTELLIGENCE, PT III, 2022, 13631 : 410 - 423
  • [3] Multi-view Stereo Network with Attention Thin Volume
    Wan, Zihang
    Xu, Chao
    Hu, Jing
    Xiao, Jian
    Meng, Zhaopeng
    Chen, Jitai
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2022, 13631 LNCS : 410 - 423
  • [4] Point-Based Multi-View Stereo Network
    Chen, Rui
    Han, Songfang
    Xu, Jing
    Su, Hao
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019 : 1538 - 1547
  • [5] Long-range Attention Network for Multi-View Stereo
    Zhang, Xudong
    Hu, Yutao
    Wang, Haochen
    Cao, Xianbin
    Zhang, Baochang
    2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021, 2021 : 3781 - 3790
  • [6] Attention-Aware Multi-View Stereo
    Luo, Keyang
    Guan, Tao
    Ju, Lili
    Wang, Yuesong
    Chen, Zhuo
    Luo, Yawei
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020 : 1587 - 1596
  • [7] Multi-View Stereo Network Based on Attention Mechanism and Neural Volume Rendering
    Zhu, Daixian
    Kong, Haoran
    Qiu, Qiang
    Ruan, Xiaoman
    Liu, Shulin
    ELECTRONICS, 2023, 12 (22)
  • [8] Multi-View Guided Multi-View Stereo
    Poggi, Matteo
    Conti, Andrea
    Mattoccia, Stefano
    2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022 : 8391 - 8398
  • [9] Visibility-Aware Point-Based Multi-View Stereo Network
    Chen, Rui
    Han, Songfang
    Xu, Jing
    Su, Hao
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (10) : 3695 - 3708
  • [10] DAR-MVSNet: a novel dual attention residual network for multi-view stereo
    Li, Tingshuai
    Liang, Hu
    Wen, Changchun
    Qu, Jiacheng
    Zhao, Shengrong
    Zhang, Qingmeng
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (8-9) : 5857 - 5866