Heterogeneous Feature Fusion Module Based on CNN and Transformer for Multiview Stereo Reconstruction

Cited by: 4

Authors:

Gao, Rui [1]

Xu, Jiajia [1]

Chen, Yipeng [2]

Cho, Kyungeun [1]

Affiliations:

[1] Dongguk Univ Seoul, Dept Multimedia Engn, 30 Pildongro 1 Gil, Seoul 04620, South Korea

[2] Dongguk Univ Seoul, Dept Autonomous Things Intelligence, 30 Pildongro 1 Gil, Seoul 04620, South Korea

Source:

MATHEMATICS | 2023 / Vol. 11 / Issue 01

Funding:

National Research Foundation, Singapore;

Keywords:

multi-view stereo; 3D reconstruction; deep learning; transformer;

DOI:

10.3390/math11010112

Chinese Library Classification (CLC) number:

O1 [Mathematics];

Discipline classification code:

0701 ; 070101 ;

Abstract:

For decades, multiview stereo (MVS), which creates 3D models of a scene from photographs, has been a vital area of computer vision research. This study presents an effective MVS network for 3D reconstruction from multiview images. Other learning-based reconstruction techniques work well; however, because convolutional neural networks (CNNs) can extract only local image features, their results contain many artifacts. Herein, a transformer and a CNN are used to extract the global and local features of the image, respectively. Additionally, hierarchical aggregation and heterogeneous interaction modules are used to improve these features. Both modules are based on the transformer and can extract dense features with 3D consistency and global context, which are necessary for accurate matching in MVS.
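The split between local and global feature extraction described in the abstract can be illustrated with a small sketch. This is not the paper's network: the neighbourhood average stands in for a convolution, the single-head attention uses identity projections instead of learned weights, and the function names (`local_features`, `global_features`, `fuse`) and concatenation-based fusion are hypothetical choices made only for illustration.

```python
import math

def local_features(tokens, radius=1):
    """Convolution-like branch (assumed stand-in): average each token
    with its neighbours inside a window of the given radius."""
    n = len(tokens)
    out = []
    for i in range(n):
        window = tokens[max(0, i - radius): min(n, i + radius + 1)]
        out.append([sum(col) / len(window) for col in zip(*window)])
    return out

def global_features(tokens):
    """Attention-like branch (assumed stand-in): every token attends to
    every token via softmax of scaled dot products, no learned weights."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) / total
                    for j in range(d)])
    return out

def fuse(tokens, radius=1):
    """Hypothetical fusion: concatenate local and global features per token."""
    loc = local_features(tokens, radius)
    glo = global_features(tokens)
    return [l + g for l, g in zip(loc, glo)]

# Four toy "pixels", each with a 2-dimensional feature vector.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
fused = fuse(tokens)
```

Each fused token carries both a neighbourhood summary and a context vector computed over the whole sequence, which is the general idea behind pairing a CNN with a transformer; the actual hierarchical aggregation and heterogeneous interaction modules are not reproduced here.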


Pages: 14

Related records (50 in total):

[1] Multiview Stereo Reconstruction with Feature Aggregation Transformer
Wang Min
Zhao Mingfu
Song Tao
Li Weiwei
Tian Yuan
Li Cheng
Zhang Yu
LASER & OPTOELECTRONICS PROGRESS, 2024, 61 (14)
[2] Spectral Reconstruction for Internet of Things Based on Parallel Fusion of CNN and Transformer
Sun, Bangyong
Wu, Changyu
Yu, Mengying
IEEE INTERNET OF THINGS JOURNAL, 2025, 12 (04): 3549-3562
[3] Object Detection Algorithm Based on CNN-Transformer Dual Modal Feature Fusion
Yang Chen
Hou Zhiqiang
Li Xinyue
Ma Sugang
Yang Xiaobao
ACTA PHOTONICA SINICA, 2024, 53 (03)
[4] Hyperspectral Image Classification Based on Interactive Transformer and CNN With Multilevel Feature Fusion Network
Yang, Hao
Yu, Haoyang
Zheng, Ke
Hu, Jiaochan
Tao, Tingting
Zhang, Qiang
IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2023, 20
[5] Transformer-based multiview spatiotemporal feature interactive fusion for human action recognition in depth videos
Wu, Hanbo
Ma, Xin
Li, Yibin
SIGNAL PROCESSING-IMAGE COMMUNICATION, 2025, 131
[6] Multiview CNN Model for Sensor Fusion Based Vehicle Detection
Ouyang, Zhenchao
Wang, Chunyuan
Liu, Yu
Niu, Jianwei
ADVANCES IN MULTIMEDIA INFORMATION PROCESSING, PT III, 2018, 11166: 459-470
[7] Multiview Scene Reconstruction Based on Edge Assisted Epipolar Transformer
Tong W.
Zhang M.
Li D.
Wu Q.
Song A.
Dianzi Yu Xinxi Xuebao/Journal of Electronics and Information Technology, 2023, 45 (10): 3483-3491
[8] Multiview stereo reconstruction of UAV remote sensing images based on adaptive propagation with multiregional refinement
Fu, Haohai
Nie, Zixuan
Pan, Xin
SCIENTIFIC REPORTS, 2025, 15 (01)
[9] BGCFormer: A Text Event Feature Fusion Learning Model based on Transformer
Liu, Yulong
Wang, Juan
Li, Qian
2023 8TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYTICS, ICCCBDA, 2023: 157-161
[10] EEG classification algorithm of motor imagery based on CNN-Transformer fusion network
Liu, Haofeng
Liu, Yuefeng
Wang, Yue
Liu, Bo
Bao, Xiang
2022 IEEE INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS, TRUSTCOM, 2022: 1302-1309
