Geometry-Enhanced Attentive Multi-View Stereo for Challenging Matching Scenarios

Cited by: 2

Authors:

Liu, Yimei [1]

Cai, Qing [1]

Wang, Congcong [2]

Yang, Jian [1]

Fan, Hao [1]

Dong, Junyu [1]

Chen, Sheng [1,3]

Affiliations:

[1] Ocean Univ China, Dept Informat Sci & Technol, Qingdao 266100, Peoples R China

[2] Tianjin Univ Technol, Dept Comp Sci & Engn, Tianjin 300222, Peoples R China

[3] Univ Southampton, Sch Elect & Comp Sci, Southampton SO17 1BJ, England

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2024, Vol. 34, No. 8

Funding:

National Science Foundation (USA); National Natural Science Foundation of China

Keywords:

Feature extraction; Estimation; Costs; Three-dimensional displays; Reliability; Pipelines; Loss measurement; Multi-view stereo; 3D reconstruction; depth estimation; geometric features; deep learning; NETWORK

DOI:

10.1109/TCSVT.2024.3376692

Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronics and Communication Technology]

Discipline classification codes:

0808; 0809

Abstract:

Deep networks have made remarkable progress on the Multi-View Stereo (MVS) task in recent years. However, finding accurate correspondences across different views under ill-posed matching conditions remains a crucial unresolved problem. To address this issue, this paper proposes a Geometry-enhanced Attentive Multi-View Stereo (GA-MVS) network, which obtains multi-view-consistent feature representations and achieves accurate depth estimation in challenging situations. Specifically, we propose a geometry-enhanced feature extractor that explores illumination-invariant geometric features and combines them with common texture features to improve matching accuracy under view-dependent photometric effects such as shadows and specularity. We then design a novel attentive learning framework that explores per-pixel adaptive supervision, effectively improving depth estimation in textureless regions. Experimental results on the DTU and Tanks & Temples benchmarks demonstrate that our method achieves state-of-the-art results compared with other advanced MVS models.


Pages: 7401-7416

Page count: 16

References: 61 in total

