Dynamic Grouping With Multi-Manifold Attention for Multi-View 3D Object Reconstruction

Cited by: 0

Authors:

Kalitsios, Georgios [1]

Konstantinidis, Dimitrios [1]

Daras, Petros [1]

Dimitropoulos, Kosmas [1]

Affiliations:

[1] Ctr Res & Technol Hellas CERTH, Informat Technol Inst, Thessaloniki 57001, Greece

Source:

IEEE Access | 2024 / Vol. 12

Keywords:

Three-dimensional displays; Image reconstruction; Transformers; Solid modeling; Computational modeling; Vectors; Surface reconstruction; Object recognition; Computer vision; Training; Dynamic grouping; multi-manifold attention; multi-view 3D reconstruction; transformer; voxel representation

DOI:

10.1109/ACCESS.2024.3483434

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

In a multi-view 3D reconstruction problem, the task is to infer the 3D shape of an object from multiple images taken from different viewpoints. Transformer-based networks achieve high performance in such problems, but they struggle to identify the best way to merge the different views so that the 3D shape of the object is estimated with high fidelity. This work addresses this issue by proposing a novel approach that computes information-rich inter-view features by dynamically combining image tokens with similar distinctive characteristics across the different views. This is achieved by leveraging the self-attention mechanism of a Transformer, enhanced with a multi-manifold attention module, to estimate the importance of image tokens on the fly and rearrange them among the different views in a way that improves the viewpoint merging procedure and the 3D reconstruction results. Experiments on ShapeNet and Pix3D validate the ability of the proposed method to achieve state-of-the-art performance in both multi-view and single-view 3D object reconstruction.
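The regrouping idea in the abstract (score image tokens with self-attention, then rearrange them across views) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes plain scaled dot-product attention with identity projections in place of the multi-manifold attention module, defines importance as the mean attention a token receives, and regroups tokens simply by sorting on that score. The function name `regroup_tokens_by_importance` and all shapes are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def regroup_tokens_by_importance(tokens):
    """Rearrange tokens across views so each group holds similarly important tokens.

    tokens: array of shape (V, N, D) = (views, tokens per view, feature dim).
    Returns (regrouped, order, importance), where regrouped has shape (V, N, D).
    Assumption: plain Euclidean dot-product attention stands in for the
    multi-manifold attention module described in the abstract.
    """
    num_views, num_tokens, dim = tokens.shape
    flat = tokens.reshape(num_views * num_tokens, dim)

    # Self-attention over all tokens of all views (identity Q/K projections).
    attn = softmax(flat @ flat.T / np.sqrt(dim), axis=-1)

    # Hypothetical importance score: mean attention each token receives.
    importance = attn.mean(axis=0)

    # Sort tokens from most to least important, then cut into V groups of N.
    order = np.argsort(-importance, kind="stable")
    regrouped = flat[order].reshape(num_views, num_tokens, dim)
    return regrouped, order, importance


rng = np.random.default_rng(0)
tokens = rng.normal(size=(3, 4, 8))  # 3 views, 4 tokens each, 8-dim features
regrouped, order, importance = regroup_tokens_by_importance(tokens)
print(regrouped.shape)
```

After regrouping, the first group holds the tokens that received the most attention regardless of which view they came from, which is one simple way to read "combining image tokens with similar characteristics among the different views".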

Pages: 160690-160699

Page count: 10

References (38 in total):

[1] Cheng, Jian; Leng, Cong; Wu, Jiaxiang; Cui, Hainan; Lu, Hanqing. Fast and Accurate Image Matching with Cascade Hashing for 3D Reconstruction. 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014: 1-8

[2] Choy, Christopher B.; Xu, Danfei; Gwak, Jun Young; Chen, Kevin; Savarese, Silvio. 3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction. Computer Vision - ECCV 2016, Pt VIII, 2016, 9912: 628-644

[3] Dosovitskiy A., 2021, INT C LEARNING REPRE, DOI 10.48550/ARXIV.2010.11929

[4] Gao, Lei; Zhao, Yingbao; Han, Jingchang; Liu, Huixian. Research on Multi-View 3D Reconstruction Technology Based on SFM. Sensors, 2022, 22(12)

[5] Han, Xian-Feng; Laga, Hamid; Bennamoun, Mohammed. Image-Based 3D Object Reconstruction: State-of-the-Art and Trends in the Deep Learning Era. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(05): 1578-1604

[6] Hu X, 2018, P BMVC, P230

[7] Jiang, Li; Shi, Shaoshuai; Qi, Xiaojuan; Jia, Jiaya. GAL: Geometric Adversarial Loss for Single-View 3D-Object Reconstruction. Computer Vision - ECCV 2018, Pt VIII, 2018, 11212: 820-834

[8] Kar A., 2017, P ADV NEUR INF PROC, V30, P1

[9] Konstantinidis, Dimitrios; Papastratis, Ilias; Dimitropoulos, Kosmas; Daras, Petros. Multi-Manifold Attention for Vision Transformers. IEEE Access, 2023, 11: 123433-123444

[10] Liu R, 2021, arXiv:2104.06637
