DGLT-Fusion: A decoupled global-local infrared and visible image fusion transformer

Cited by: 14
Authors
Yang, Xin [1]
Huo, Hongtao [1]
Wang, Renhua [1]
Li, Chang [2]
Liu, Xiaowen [1]
Li, Jing [3]
Affiliations
[1] Peoples Publ Secur Univ China, Sch Informat Technol & Cyber Secur, Beijing 100038, Peoples R China
[2] Hefei Univ Technol, Dept Biomed Engn, Hefei 230009, Peoples R China
[3] Cent Univ Finance & Econ, Sch Informat, Beijing 100081, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Infrared image; Visible image; Transformer; Convolutional neural networks; Image fusion; Network;
DOI
10.1016/j.infrared.2022.104522
CLC (Chinese Library Classification)
TH7 [Instruments and Meters];
Discipline Classification Codes
0804; 080401; 081102;
Abstract
Convolutional neural network (CNN) and generative adversarial network (GAN) based approaches have achieved substantial performance in the image fusion field. However, these methods focus on extracting local features and pay little attention to learning global dependencies. In recent years, given its competitive long-term dependency modeling capability, the Transformer-based fusion method has made impressive achievements, but it processes long-term correspondences and short-term features simultaneously, which might result in deficient global-local information interaction. Towards this end, we propose a decoupled global-local infrared and visible image fusion Transformer (DGLT-Fusion). DGLT-Fusion decouples global-local information learning into a Transformer module and a CNN module. Long-term dependencies are modeled by a series of Transformer blocks (global-decoupled Transformer blocks), while short-term features are extracted by local-decoupled convolution blocks. In addition, we design Transformer dense connections to retain more information. The two modules are stacked in an interweaving manner, which enables our network to retain texture and detail information more completely. Furthermore, comparative experimental results show that DGLT-Fusion achieves better performance than state-of-the-art approaches.
Pages: 14
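As an illustration of the decoupled design the abstract describes (Transformer blocks for global dependencies, convolution blocks for local features, dense connections among the Transformer blocks, and interweaved stacking of the two module types), the following is a minimal PyTorch sketch. Every name (GlobalDecoupledBlock, LocalDecoupledBlock, DGLTFusionSketch), the channel width, the depth, and the 1x1-convolution dense fusion are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class LocalDecoupledBlock(nn.Module):
    """Convolution block for short-term (local) feature extraction (name assumed)."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.LeakyReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.LeakyReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)

class GlobalDecoupledBlock(nn.Module):
    """Transformer block for long-term (global) dependency modeling (name assumed)."""
    def __init__(self, channels, heads=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(channels)
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels * 2),
            nn.GELU(),
            nn.Linear(channels * 2, channels),
        )

    def forward(self, x):
        # Flatten the feature map (B, C, H, W) into a token sequence (B, H*W, C).
        b, c, h, w = x.shape
        t = x.flatten(2).transpose(1, 2)
        n = self.norm1(t)
        a, _ = self.attn(n, n, n)          # global self-attention over all positions
        t = t + a
        t = t + self.mlp(self.norm2(t))
        return t.transpose(1, 2).reshape(b, c, h, w)

class DGLTFusionSketch(nn.Module):
    """Interweaved stack of global (Transformer) and local (CNN) blocks,
    with dense connections linking the Transformer block outputs."""
    def __init__(self, channels=32, depth=3):
        super().__init__()
        self.embed = nn.Conv2d(2, channels, 3, padding=1)  # concatenated IR + VIS input
        self.global_blocks = nn.ModuleList(
            GlobalDecoupledBlock(channels) for _ in range(depth))
        self.local_blocks = nn.ModuleList(
            LocalDecoupledBlock(channels) for _ in range(depth))
        # 1x1 convs that fuse all earlier Transformer outputs (assumed dense-link form).
        self.dense_fuse = nn.ModuleList(
            nn.Conv2d(channels * (i + 1), channels, 1) for i in range(depth))
        self.head = nn.Conv2d(channels, 1, 3, padding=1)

    def forward(self, ir, vis):
        x = self.embed(torch.cat([ir, vis], dim=1))
        dense = []
        for g, l, fuse in zip(self.global_blocks, self.local_blocks, self.dense_fuse):
            dense.append(g(x))                     # global branch output
            g_out = fuse(torch.cat(dense, dim=1))  # dense Transformer connections
            x = l(g_out)                           # local branch refines the result
        return torch.sigmoid(self.head(x))

A quick smoke test: DGLTFusionSketch()(torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)) returns a (1, 1, 64, 64) fused image. Alternating the two block types at every stage is one way to realize the global-local interaction the abstract argues for; the paper itself may stack and connect the modules differently.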