3D-C2FT: Coarse-to-Fine Transformer for Multi-view 3D Reconstruction

Cited by: 4
Authors
Tiong, Leslie Ching Ow [1]
Sigmund, Dick [2]
Teoh, Andrew Beng Jin [3]
Affiliations
[1] Korea Inst Sci & Technol, Computat Sci Res Ctr, 5 Hwarang Ro 14 Gil, Seoul 02792, South Korea
[2] AIDOT Inc, 128 Beobwon Ro, Seoul 05854, South Korea
[3] Yonsei Univ, Sch Elect & Elect Engn, Seoul 120749, South Korea
Source
Keywords
Multi-view 3D reconstruction; Coarse-to-fine transformer; Multi-scale attention
DOI
10.1007/978-3-031-26319-4_13
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently, the transformer model has been successfully employed for the multi-view 3D reconstruction problem. However, challenges remain in designing an attention mechanism that explores the multi-view features and exploits their relations to reinforce the encoding-decoding modules. This paper proposes a new model, the 3D coarse-to-fine transformer (3D-C2FT), which introduces a novel coarse-to-fine (C2F) attention mechanism for encoding multi-view features and rectifying defective voxel-based 3D objects. The C2F attention mechanism enables the model to learn multi-view information flow and synthesize 3D surface corrections in a coarse-to-fine manner. The proposed model is evaluated on the ShapeNet and Multi-view Real-life voxel-based datasets. Experimental results show that 3D-C2FT outperforms several competing models on these datasets.
Pages: 211-227
Page count: 17
Related papers
50 in total
  • [1] A Coarse-to-Fine Transformer-Based Network for 3D Reconstruction from Non-Overlapping Multi-View Images
    Shan, Yue
    Xiao, Jun
    Liu, Lupeng
    Wang, Yunbiao
    Yu, Dongbo
    Zhang, Wenniu
    REMOTE SENSING, 2024, 16 (05)
  • [2] C2FNet: A Coarse-to-Fine Network for Multi-View 3D Point Cloud Generation
    Lei, Jianjun
    Song, Jiahui
    Peng, Bo
    Li, Wanqing
    Pan, Zhaoqing
    Huang, Qingming
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 6707 - 6718
  • [3] MVS-T: A Coarse-to-Fine Multi-View Stereo Network with Transformer for Low-Resolution Images 3D Reconstruction
    Jia, Ruiming
    Chen, Xin
    Cui, Jiali
    Hu, Zhenghui
    SENSORS, 2022, 22 (19)
  • [4] MULTI-VIEW 3D RECONSTRUCTION FROM VIDEO WITH TRANSFORMER
    Zhong, Yijie
    Sun, Zhengxing
    Sun, Yunhan
    Luo, Shoutong
    Wang, Yi
    Zhang, Wei
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022 : 1661 - 1665
  • [5] Cross-view Transformer for enhanced multi-view 3D reconstruction
    Shi, Wuzhen
    Yin, Aixue
    Li, Yingxiang
    Qian, Bo
    VISUAL COMPUTER, 2024
  • [6] 3D Reconstruction for Multi-view Objects
    Yu, Jun
    Yin, Wenbin
    Hu, Zhiyi
    Liu, Yabin
    COMPUTERS & ELECTRICAL ENGINEERING, 2023, 106
  • [7] Multi-view 3D Reconstruction with Transformers
    Wang, Dan
    Cui, Xinrui
    Chen, Xun
    Zou, Zhengxia
    Shi, Tianyang
    Salcudean, Septimiu
    Wang, Z. Jane
    Ward, Rabab
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021 : 5702 - 5711
  • [8] Long-Range Grouping Transformer for Multi-View 3D Reconstruction
    Yang, Liying
    Zhu, Zhenwei
    Lin, Xuxin
    Nong, Jian
    Liang, Yanyan
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023 : 18211 - 18221
  • [9] Multi-View Transformer for 3D Visual Grounding
    Huang, Shijia
    Chen, Yilun
    Jia, Jiaya
    Wang, Liwei
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022 : 15503 - 15512
  • [10] 3D Texture Mapping in Multi-view Reconstruction
    Chen, Zhaolin
    Zhou, Jun
    Chen, Yisong
    Wang, Guoping
    ADVANCES IN VISUAL COMPUTING, ISVC 2012, PT I, 2012, 7431 : 359 - 371