ResCCFusion: Infrared and visible image fusion network based on ResCC module and spatial criss-cross attention models

Cited by: 5
Authors
Xiong, Zhang [1 ]
Zhang, Xiaohui [1 ]
Han, Hongwei [1 ]
Hu, Qingping [1 ]
Affiliations
[1] Naval Univ Engn, Dept Weap Engn, Wuhan 430030, Peoples R China
Keywords
Image fusion; Auto-encoder; Residual network; Infrared image; Visible image; INFORMATION; FRAMEWORK; NEST;
DOI
10.1016/j.infrared.2023.104962
Chinese Library Classification
TH7 [Instruments and meters];
Subject Classification Codes
0804 ; 080401 ; 081102 ;
Abstract
We propose an infrared and visible image fusion method based on the ResCC module and spatial criss-cross attention models. The method adopts an auto-encoder structure consisting of an encoder network, fusion layers, and a decoder network. The encoder network comprises a convolution layer and three densely connected ResCC blocks. Each ResCC block extracts multi-scale features from the source images without downsampling and retains as many feature details as possible for fusion. The fusion layer adopts spatial criss-cross attention models, which capture contextual information along both the horizontal and vertical directions; restricting attention to these two directions also reduces the cost of computing the attention maps. The decoder network consists of four convolution layers that reconstruct the fused image from the feature maps. Experiments on public datasets demonstrate that the proposed method achieves better fusion performance in both objective and subjective evaluations than other advanced fusion methods. The code is available at https://github.com/xiongzhangzzz/ResCCFusion.
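The computational saving claimed for the criss-cross fusion layer comes from attending, for each position, only over its own row and column (H + W - 1 positions) rather than all H * W positions of the feature map. A minimal single-head NumPy sketch of this idea (not the authors' implementation; function name, shapes, and the plain scaled dot-product scoring are illustrative assumptions) follows:

```python
import numpy as np

def criss_cross_attention(x):
    """Hypothetical minimal sketch of criss-cross attention.

    x: feature map of shape (H, W, C). Each position attends only to
    the H + W - 1 positions in its own row and column, which is the
    source of the reduced attention-map cost versus full attention.
    """
    H, W, C = x.shape
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            q = x[i, j]  # query feature at position (i, j)
            # Keys/values: row i plus column j, counting (i, j) once.
            keys = np.concatenate([x[i, :, :],
                                   np.delete(x[:, j, :], i, axis=0)])
            scores = keys @ q / np.sqrt(C)        # scaled dot products
            weights = np.exp(scores - scores.max())
            weights /= weights.sum()              # softmax over H + W - 1 entries
            out[i, j] = weights @ keys            # weighted sum of values
    return out
```

In the paper's setting two such passes in sequence would let information propagate between any two positions; this sketch shows only a single pass to keep the cost argument visible.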
Pages: 10