MPCFusion: Multi-scale parallel cross fusion for infrared and visible images via convolution and vision Transformer

Cited by: 7

Authors:

Tang, Haojie ^{[1]}

Qian, Yao ^{[1]}

Xing, Mengliang ^{[1]}

Cao, Yisheng ^{[1]}

Liu, Gang ^{[1]}

Affiliation:

[1] Shanghai Univ Elect Power, Sch Automat Engn, Shanghai 200090, Peoples R China

Source:

OPTICS AND LASERS IN ENGINEERING | 2024 / Volume 176

Funding:

National Natural Science Foundation of China;

Keywords:

Image fusion; Vision Transformer; Convolution; Multi-scale feature; Infrared; NETWORK;

DOI:

10.1016/j.optlaseng.2024.108094

Chinese Library Classification (CLC) number:

O43 [Optics];

Discipline classification code:

070207 ; 0803 ;

Abstract:

The image fusion community is thriving with the wave of deep learning, and the most popular fusion methods are usually built upon well-designed network structures. However, most current methods do not fully exploit deeper features and ignore the importance of long-range dependencies. In this paper, a convolution and vision Transformer-based multi-scale parallel cross fusion network for infrared and visible images (MPCFusion) is proposed. To exploit deeper texture details, a feature extraction module based on convolution and vision Transformer is designed. To correlate the shallow features of different modalities, a parallel cross-attention module is proposed, in which a parallel-channel model efficiently preserves the proprietary modal features, followed by a cross-spatial model that ensures information interaction between the different modalities. Moreover, a cross-domain attention module based on convolution and vision Transformer is proposed to capture long-range dependencies between in-depth features and effectively solve the problem of global context loss. Finally, a nest-connection-based decoder is used for feature reconstruction. In particular, we design a new texture-guided structural similarity loss function to drive the network to preserve more complete texture details. Extensive experimental results illustrate that MPCFusion shows excellent fusion performance and generalization capabilities. The source code will be released at https://github.com/YQ-097/MPCFusion.
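
For readers unfamiliar with cross-attention between modalities, the general idea can be sketched as follows. This is a generic scaled dot-product cross-attention in plain Python and is not the MPCFusion module itself: the function names, the toy token values, and the omission of learned query/key/value projections are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention where queries come from one modality
    and keys/values come from the other. Each argument is a list of
    token vectors (lists of floats). Hypothetical helper, no learned weights."""
    d = len(queries[0])
    out = []
    for q in queries:
        # Similarity of this query token to every key token.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value tokens.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


# Made-up toy features: 2 infrared tokens and 3 visible tokens, 2 dimensions each.
ir_tokens = [[1.0, 0.0], [0.0, 1.0]]
vis_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Each modality queries the other, so information flows in both directions.
ir_attends_vis = cross_attention(ir_tokens, vis_tokens, vis_tokens)
vis_attends_ir = cross_attention(vis_tokens, ir_tokens, ir_tokens)
```

Each output token is a convex combination of the other modality's tokens, which is one simple way to picture the "information interaction" the abstract refers to. A practical network would add learned projections, multiple heads, and spatial reshaping of feature maps.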


Pages: 13

