IFA-Net: Isomerous Feature-aware Network for Single-view 3D Reconstruction

Cited by: 0

Authors:

Zhang, Zecheng [1]

Han, Xianfeng [1]

Xiao, Guoqian [1]

Affiliations:

[1] Southwest Univ, Coll Comp & Informat Sci, Chongqing, Peoples R China

Source:

2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN | 2023

Keywords:

3D Reconstruction; Single-view; Vision transformer; Convolutional neural networks; Feature-aware;

DOI:

10.1109/IJCNN54540.2023.10191001

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Single-view 3D reconstruction has long been an intractable and fundamental problem in computer vision. Objects with complex topological structures are difficult to reconstruct accurately, so existing methods suffer from blurred shape boundaries between the multiple components of an object. Recently, convolutional neural networks and vision transformers have begun to appear in the field of 3D reconstruction and have been widely used with excellent performance. However, existing transformer-based methods mainly focus on global long-range context dependencies and ignore the local details of part-level spatial features, resulting in poor reconstruction of fine details. In this paper, we propose a novel dual-branch network architecture, called IFA-Net, to capture local spatial perception information and retain global structural features for single-view 3D reconstruction. In addition, we propose an isomerous feature-aware module, which enables the dynamic fusion of features at different resolutions from the two branches. Thus, high-fidelity and detail-rich 3D object reconstruction can be achieved. Extensive experimental results demonstrate that our method is able to produce high-quality voxels, particularly for objects with diverse topologies, compared with state-of-the-art methods.
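
The abstract does not say how the isomerous feature-aware module fuses the two branches. As a rough illustration only, the sketch below shows one generic way to fuse a high-resolution "local" feature map with a lower-resolution "global" one using a learned per-pixel gate. It is not the authors' module: the function names `upsample_nearest` and `gated_fusion`, the nearest-neighbour upsampling, and the sigmoid gate are all assumptions made for this example.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) array by an integer factor."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def gated_fusion(local_feat, global_feat, w_gate, b_gate):
    """Hypothetical gated fusion of two feature maps (not the paper's module).

    local_feat:  (C, H, W) features, e.g. from a convolutional branch
    global_feat: (C, h, w) features, e.g. from a transformer branch, H = h * factor
    w_gate:      (C, 2C) weights of a per-pixel linear map (a 1x1 convolution)
    b_gate:      (C,) bias of that map
    """
    factor = local_feat.shape[1] // global_feat.shape[1]
    global_up = upsample_nearest(global_feat, factor)           # (C, H, W)
    stacked = np.concatenate([local_feat, global_up], axis=0)   # (2C, H, W)
    # Per-pixel linear map from 2C channels to C gate logits.
    logits = np.einsum("oc,chw->ohw", w_gate, stacked) + b_gate[:, None, None]
    gate = 1.0 / (1.0 + np.exp(-logits))                        # sigmoid in (0, 1)
    # Convex combination: the gate decides, per channel and pixel, how much
    # local detail versus global structure to keep.
    return gate * local_feat + (1.0 - gate) * global_up
```

With all gate weights and biases set to zero the gate is 0.5 everywhere, so the output reduces to the plain average of the two (resolution-matched) feature maps; trained weights would shift that balance per location.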

References

Pages: 8

27 entries in total

[21] Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images [J]. Wang, Nanyang; Zhang, Yinda; Li, Zhuwen; Fu, Yanwei; Liu, Wei; Jiang, Yu-Gang. COMPUTER VISION - ECCV 2018, PT XI, 2018, 11215: 55-71

[22] Wu J, 2017, ADV NEUR IN, V30

[23] Wu ZR, 2015, PROC CVPR IEEE, P1912, DOI 10.1109/CVPR.2015.7298801

[24] Xiao T., 2021, Advances in Neural Information Processing Systems, V34, P30392

[25] Pix2Vox++: Multi-scale Context-aware 3D Object Reconstruction from Single and Multiple Images [J]. Xie, Haozhe; Yao, Hongxun; Zhang, Shengping; Zhou, Shangchen; Sun, Wenxiu. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2020, 128 (12): 2919-2935

[26] Robust Attentional Aggregation of Deep Feature Sets for Multi-view 3D Reconstruction [J]. Yang, Bo; Wang, Sen; Markham, Andrew; Trigoni, Niki. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2020, 128 (01): 53-73

[27] Zai S., 2021, BMVC, P405
