Transformer Based Conditional GAN for Multimodal Image Fusion

Cited by: 32

Authors:

Zhang, Jun [1]
Jiao, Licheng [1]
Ma, Wenping [1]
Liu, Fang [1]
Liu, Xu [1]
Li, Lingling [1]
Chen, Puhua [1]
Yang, Shuyuan [1]

Affiliations:

[1] Xidian Univ, Joint Int Res Lab Intelligent Percept & Computat, Int Res Ctr Intelligent Percept & Computat, Minist, Sch Artificial Intelligence, Key Lab Intelligent Pe, Xian 710071, Peoples R China

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2023 / Vol. 25

Keywords:

Image fusion; Generators; Training; Feature extraction; Transformers; Generative adversarial networks; Thermal sensors; Generative adversarial network; multimodal image fusion; transformer; GENERATIVE ADVERSARIAL NETWORK; MULTISCALE; FRAMEWORK; DEEP;

DOI:

10.1109/TMM.2023.3243659

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Multimodal image fusion is becoming increasingly urgent for multi-sensor information utilization. However, existing end-to-end image fusion frameworks ignore the integration of a priori knowledge and long-distance dependencies across domains, which hampers network convergence and global image perception in complex scenes. In this article, a conditional generative adversarial network with transformer (TCGAN) is proposed for multimodal image fusion. The generator produces a fused image that carries the content of the source images. The discriminators distinguish the fused image from the source images. Adversarial training enables the final fused image to retain the structural and textural details of the cross-modal images simultaneously. In particular, a wavelet fusion module ensures that the inputs contain as much image content from the different domains as possible. The extracted convolutional features interact in a multiscale cross-modal transformer fusion module so that the associated information is fully complemented, which lets the generator attend to both local and global context. TCGAN fully considers the training efficiency of the adversarial process and the integrated retention of redundant information. Experiments on public datasets show that TCGAN yields fused images with highlighted targets and rich details, and that it converges quickly.
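The abstract does not specify how the wavelet fusion module combines its inputs. As a rough illustration of the general idea only, the sketch below fuses two registered grayscale images with a single-level Haar transform, averaging the low-frequency band and keeping the larger-magnitude detail coefficient. The function names (`haar_dwt2`, `haar_idwt2`, `wavelet_fuse`) and the fusion rule are assumptions chosen for illustration; this is a classical wavelet fusion baseline, not the TCGAN module.

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2D Haar transform. Height and width must be even."""
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-frequency approximation
    lh = (a - b + c - d) / 2.0  # detail: differences between columns
    hl = (a + b - c - d) / 2.0  # detail: differences between rows
    hh = (a - b - c + d) / 2.0  # detail: diagonal differences
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Exact inverse of haar_dwt2."""
    h, w = ll.shape
    out = np.empty((2 * h, 2 * w), dtype=np.float64)
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    out[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    out[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return out

def wavelet_fuse(img_a, img_b):
    """Fuse two same-sized images (assumed rule, not taken from the paper):
    average the approximation bands, keep the larger-magnitude details."""
    ca = haar_dwt2(np.asarray(img_a, dtype=np.float64))
    cb = haar_dwt2(np.asarray(img_b, dtype=np.float64))
    fused_ll = (ca[0] + cb[0]) / 2.0
    fused_details = [
        np.where(np.abs(x) >= np.abs(y), x, y)
        for x, y in zip(ca[1:], cb[1:])
    ]
    return haar_idwt2(fused_ll, *fused_details)
```

Calling `wavelet_fuse(infrared, visible)` on two aligned arrays of equal, even-sized shape returns an array of the same shape. In a pipeline like the one the abstract describes, such a pre-fused image would presumably serve as an input or prior for the generator rather than as the final result.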


Pages: 8988-9001

Page count: 14
