ResCCFusion: Infrared and visible image fusion network based on ResCC module and spatial criss-cross attention models

Cited by: 5
Authors
Xiong, Zhang [1 ]
Zhang, Xiaohui [1 ]
Han, Hongwei [1 ]
Hu, Qingping [1 ]
Affiliations
[1] Naval Univ Engn, Dept Weap Engn, Wuhan 430030, Peoples R China
Keywords
Image fusion; Auto-encoder; Residual network; Infrared image; Visible image; INFORMATION; FRAMEWORK; NEST;
DOI
10.1016/j.infrared.2023.104962
Chinese Library Classification
TH7 [Instruments and meters];
Subject Classification Codes
0804 ; 080401 ; 081102 ;
Abstract
We propose an infrared and visible image fusion method based on the ResCC module and spatial criss-cross attention models. The method adopts an auto-encoder structure consisting of an encoder network, fusion layers, and a decoder network. The encoder network comprises a convolution layer and three densely connected ResCC blocks. Each ResCC block extracts multi-scale features from the source images without downsampling and retains as many feature details as possible for fusion. The fusion layer adopts spatial criss-cross attention models, which capture contextual information along both the horizontal and vertical directions; restricting attention to these two directions also reduces the cost of computing the attention maps. The decoder network consists of four convolution layers that reconstruct the fused image from the feature maps. Experiments on public datasets demonstrate that the proposed method achieves better fusion performance in both objective and subjective evaluations than other advanced fusion methods. The code is available at https://github.com/xiongzhangzzz/ResCCFusion.
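The computational saving claimed for the criss-cross fusion layer comes from attending, for each position, only over its own row and column (H + W - 1 positions) rather than all H * W positions of the feature map. A minimal single-head NumPy sketch of this idea (not the authors' implementation; function name, shapes, and the plain scaled dot-product scoring are illustrative assumptions) follows:

```python
import numpy as np

def criss_cross_attention(x):
    """Hypothetical minimal sketch of criss-cross attention.

    x: feature map of shape (H, W, C). Each position attends only to
    the H + W - 1 positions in its own row and column, which is the
    source of the reduced attention-map cost versus full attention.
    """
    H, W, C = x.shape
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            q = x[i, j]  # query feature at position (i, j)
            # Keys/values: row i plus column j, counting (i, j) once.
            keys = np.concatenate([x[i, :, :],
                                   np.delete(x[:, j, :], i, axis=0)])
            scores = keys @ q / np.sqrt(C)        # scaled dot products
            weights = np.exp(scores - scores.max())
            weights /= weights.sum()              # softmax over H + W - 1 entries
            out[i, j] = weights @ keys            # weighted sum of values
    return out
```

In the paper's setting two such passes in sequence would let information propagate between any two positions; this sketch shows only a single pass to keep the cost argument visible.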
Pages: 10