A Coarse-to-Fine Transformer-Based Network for 3D Reconstruction from Non-Overlapping Multi-View Images

Times Cited: 2
Authors
Shan, Yue [1 ]
Xiao, Jun [1 ]
Liu, Lupeng [1 ]
Wang, Yunbiao [1 ]
Yu, Dongbo [1 ]
Zhang, Wenniu [1 ]
Affiliations
[1] Univ Chinese Acad & Sci, Sch Artificial Intelligence, 19 Yuquan Rd, Beijing 100049, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
point cloud reconstruction; Transformer; non-overlapping; multi-view; POINT CLOUD RECONSTRUCTION; SHAPE;
DOI
10.3390/rs16050901
Chinese Library Classification (CLC)
X [Environmental Science, Safety Science];
Discipline Classification Code
08; 0830;
Abstract
Reconstructing 3D structures from non-overlapping multi-view images is a crucial yet challenging task in 3D computer vision, since it is difficult to establish feature correspondences and infer depth without overlapping parts of the views. Previous methods, whether generating the surface mesh or the volume of an object, struggle to simultaneously ensure the accuracy of detailed topology and the integrity of the overall structure. In this paper, we introduce a novel coarse-to-fine Transformer-based reconstruction network that generates precise point clouds from multiple input images taken at sparse, non-overlapping viewpoints. Specifically, we first employ a general point cloud generation architecture, enhanced by an adaptive centroid constraint, to produce a coarse point cloud of the object. Subsequently, a Transformer-based refinement module applies a deformation to each point. We design an attention-based encoder to encode both image projection features and point cloud geometric features, along with a decoder to calculate deformation residuals. Experiments on ShapeNet demonstrate that the proposed method outperforms competing methods.
Pages: 18
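The following is a minimal PyTorch-style sketch of the coarse-to-fine pipeline described in the abstract above. The module names (CoarsePointGenerator, TransformerRefiner), feature dimensions, mean-pooling of view features, and the omission of the image-projection step are assumptions made for illustration; they are not the authors' implementation.

```python
# Minimal sketch of the coarse-to-fine idea: a coarse point cloud is generated
# from fused multi-view features, then a Transformer predicts per-point
# deformation residuals. All names and sizes are illustrative assumptions.
import torch
import torch.nn as nn


class CoarsePointGenerator(nn.Module):
    """Maps pooled multi-view image features to a coarse point cloud (B, N, 3)."""

    def __init__(self, feat_dim=256, num_points=1024):
        super().__init__()
        self.num_points = num_points
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.ReLU(),
            nn.Linear(512, num_points * 3),
        )

    def forward(self, view_feats):          # view_feats: (B, V, feat_dim)
        pooled = view_feats.mean(dim=1)     # fuse non-overlapping views by pooling
        return self.mlp(pooled).view(-1, self.num_points, 3)


class TransformerRefiner(nn.Module):
    """Attention-based refinement: predicts a per-point deformation residual."""

    def __init__(self, feat_dim=256, nhead=4, num_layers=2):
        super().__init__()
        self.point_embed = nn.Linear(3, feat_dim)         # geometric features
        layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.residual_head = nn.Linear(feat_dim, 3)       # decodes offsets

    def forward(self, coarse_pts, view_feats):            # (B, N, 3), (B, V, C)
        tokens = self.point_embed(coarse_pts) + view_feats.mean(dim=1, keepdim=True)
        tokens = self.encoder(tokens)
        return coarse_pts + self.residual_head(tokens)    # refined point cloud


if __name__ == "__main__":
    B, V, C = 2, 3, 256                     # batch, views, feature channels
    view_feats = torch.randn(B, V, C)       # stand-in for per-view CNN features
    coarse = CoarsePointGenerator(C)(view_feats)
    refined = TransformerRefiner(C)(coarse, view_feats)
    print(coarse.shape, refined.shape)      # torch.Size([2, 1024, 3]) twice
```

In this sketch the refinement adds a residual offset to each coarse point rather than regenerating the cloud, mirroring the deformation-residual decoder described in the abstract.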