GTMFuse: Group-attention transformer-driven multiscale dense feature-enhanced network for infrared and visible image fusion

Cited: 16
|
Authors
Mei, Liye [1 ,2 ]
Hu, Xinglong [1 ]
Ye, Zhaoyi [1 ]
Tang, Linfeng [3 ]
Wang, Ying [4 ]
Li, Di [1 ]
Liu, Yan [4 ]
Hao, Xin [5 ]
Lei, Cheng [2 ]
Xu, Chuan [1 ]
Yang, Wei [4 ,6 ]
Affiliations
[1] Hubei Univ Technol, Sch Comp Sci, Wuhan 430068, Peoples R China
[2] Wuhan Univ, Inst Technol Sci, Wuhan 430072, Peoples R China
[3] Wuhan Univ, Elect Informat Sch, Wuhan 430072, Peoples R China
[4] Wuchang Shouyi Univ, Sch Informat Sci & Engn, Wuhan 430064, Peoples R China
[5] Antgroup, Hangzhou 310020, Peoples R China
[6] Wuhan Univ, State Key Lab Informat Engn Surveying Mapping & Re, Wuhan 430072, Peoples R China
Keywords
Infrared and visible image fusion; Deep learning; Group-attention; Multiscale feature
DOI
10.1016/j.knosys.2024.111658
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Infrared and visible images captured by different devices can be seamlessly integrated into a single composite image through the application of image fusion techniques. However, many existing convolutional neural network-based methods for infrared and visible image fusion have exhibited limited capability for effectively amalgamating information from the source images. Consequently, we propose a group-attention transformer-driven multiscale feature-enhanced network for infrared and visible image fusion, which we abbreviate as GTMFuse. Specifically, GTMFuse employs multiscale dual-channel encoders to independently process the source images and extract multiscale features. Within the encoders, a group-attention transformer module facilitates more comprehensive long-range feature dependency modeling at each scale. This module combines a fixed-direction stripe attention mechanism with channel attention and window attention, enabling comprehensive global long-range information capture and interaction with feature information across the source images. The multiscale features obtained from the group-attention transformer module are integrated into the fused image through a meticulously designed dense fusion block. Furthermore, this study introduces a novel dataset named HBUT-IV, encompassing surveillance images captured from multiple viewpoints. The HBUT-IV dataset serves as a valuable benchmark for assessing the efficacy of fusion methods. Extensive experiments are conducted on four datasets employing nine comparative methods, revealing the superior performance of the GTMFuse approach. The implementation code is accessible at https://github.com/XingLongH/GTMFuse.
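The abstract describes combining a fixed-direction stripe attention branch with channel attention (window attention is omitted here for brevity). The following is a minimal NumPy sketch of that combination idea only, not the authors' implementation; the function names, the sigmoid stripe gating, and the additive branch fusion are all simplifying assumptions made for illustration.

```python
import numpy as np

def channel_attention(x):
    # x: (C, H, W). Squeeze spatial dims, softmax over channels,
    # then reweight each channel (assumed simplification).
    w = x.mean(axis=(1, 2))
    w = np.exp(w - w.max())
    w /= w.sum()
    return x * w[:, None, None]

def stripe_attention(x, axis):
    # Fixed-direction stripe attention sketch: pool along one spatial
    # axis so each stripe gets a single gate, then apply a sigmoid gate
    # to all positions in that stripe (hypothetical formulation).
    s = x.mean(axis=axis, keepdims=True)
    gate = 1.0 / (1.0 + np.exp(-s))
    return x * gate

def group_attention(x):
    # Combine the channel branch with horizontal and vertical stripe
    # branches by summation (the actual fusion rule in GTMFuse may differ).
    return (channel_attention(x)
            + stripe_attention(x, axis=1)
            + stripe_attention(x, axis=2))

# Toy feature map: 4 channels, 8x8 spatial grid.
x = np.random.rand(4, 8, 8).astype(np.float32)
y = group_attention(x)
```

The output keeps the input shape `(C, H, W)`, so the module can be dropped between encoder stages at any scale, which matches the abstract's claim that the group-attention transformer operates on features at each scale.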
Pages: 15
Related Papers
50 records
  • [31] Multiscale feature pyramid network based on activity level weight selection for infrared and visible image fusion
    Xu, Rui
    Liu, Gang
    Xie, Yuning
    Prasad, Bavirisetti Durga
    Qian, Yao
    Xing, Mengliang
    JOURNAL OF THE OPTICAL SOCIETY OF AMERICA A-OPTICS IMAGE SCIENCE AND VISION, 2022, 39 (12) : 2193 - 2204
  • [33] Unsupervised densely attention network for infrared and visible image fusion
    Li, Yang
    Wang, Jixiao
    Miao, Zhuang
    Wang, Jiabao
    MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (45-46) : 34685 - 34696
  • [35] HATF: Multi-Modal Feature Learning for Infrared and Visible Image Fusion via Hybrid Attention Transformer
    Liu, Xiangzeng
    Wang, Ziyao
    Gao, Haojie
    Li, Xiang
    Wang, Lei
    Miao, Qiguang
    REMOTE SENSING, 2024, 16 (05)
  • [36] RDCa-Net: Residual dense channel attention symmetric network for infrared and visible image fusion
    Huang, Zuyan
    Yang, Bin
    Liu, Chang
    INFRARED PHYSICS & TECHNOLOGY, 2023, 130
  • [37] GRDATFusion: A gradient residual dense and attention transformer infrared and visible image fusion network for smart city security systems in cloud and fog computing
    Zheng, Jian
    Jeon, Seunggil
    Yang, Xiaomin
    EXPERT SYSTEMS, 2025, 42 (02)
  • [38] SEGMENTATION-DRIVEN INFRARED AND VISIBLE IMAGE FUSION VIA TRANSFORMER-ENHANCED ARCHITECTURE SEARCHING
    Fu, Hongming
    Wu, Guanyao
    Liu, Zhu
    Yan, Tiantian
    Liu, Jinyuan
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 4230 - 4234
  • [39] SCGAFusion: A skip-connecting group convolutional attention network for infrared and visible image fusion
    Zhu, Danchen
    Ma, Jingbin
    Li, Dong
    Wang, Xiaoming
    APPLIED SOFT COMPUTING, 2024, 163
  • [40] RITFusion: Reinforced Interactive Transformer Network for Infrared and Visible Image Fusion
    Li, Xiaoling
    Li, Yanfeng
    Chen, Houjin
    Peng, Yahui
    Chen, Luyifu
    Wang, Minjun
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2024, 73 : 1 - 16