AFDFusion: An adaptive frequency decoupling fusion network for multi-modality image

Cited: 2
Authors
Wang, Chengchao [1 ]
Zhao, Zhengpeng [1 ]
Yang, Qiuxia [1 ]
Nie, Rencan [1 ]
Cao, Jinde [2 ,3 ]
Pu, Yuanyuan [1 ,4 ]
Affiliations
[1] Yunnan Univ, Coll Informat Sci & Engn, Kunming 650500, Peoples R China
[2] Southeast Univ, Sch Math, Nanjing 210096, Peoples R China
[3] Ahlia Univ, Manama, Bahrain
[4] Univ Key Lab Internet Things Technol & Applicat Yu, Kunming 650500, Yunnan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image fusion; Adaptive frequency decoupling; Contrastive learning; Associative invariant; Intrinsic specific; ARCHITECTURE; PERFORMANCE; FRAMEWORK; NEST;
DOI
10.1016/j.eswa.2024.125694
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The goal of multi-modality image fusion is to create a single image that provides a comprehensive scene description and conforms to visual perception by integrating complementary information from the merits of the different modalities, e.g., the salient intensities of infrared images and the detail textures of visible images. Although some works explore decoupled representations of multi-modality images, they struggle with complex nonlinear relationships, fine modal decoupling, and noise handling. To cope with this issue, we propose an adaptive frequency decoupling module that perceives the associative invariant and the inherent specific across modalities by dynamically adjusting the learnable low-frequency weight of the kernel. Specifically, we utilize a contrastive learning loss to restrict the solution space of feature decoupling and learn representations of both the invariant and the specific in multi-modality images. The underlying idea is that, during decoupling, low-frequency features that are similar in the representation space should be pulled closer together, signifying the associative invariant, while high-frequency features are pushed farther apart, indicating the intrinsic specific. Additionally, a multi-stage training manner is introduced into our framework to achieve decoupling and fusion. In Stage I, a MixEncoder and a MixDecoder with the same architecture but different parameters are trained to perform decoupling and reconstruction, supervised by the contrastive self-supervised mechanism. In Stage II, two feature fusion modules are added to integrate the invariant and specific features and output the fused image. Extensive experiments demonstrate the superiority of the proposed method over state-of-the-art methods in both qualitative and quantitative evaluation on two multi-modal image fusion tasks.
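The decoupling idea in the abstract — pull the low-frequency (associative invariant) features of the two modalities together while pushing the high-frequency (intrinsic specific) features apart — can be sketched with a toy NumPy example. Note this is a minimal illustration under stated assumptions, not the paper's method: the fixed box-filter low-pass, raw-pixel cosine similarity, and InfoNCE-style loss are stand-ins for the paper's learnable low-frequency kernel weights and deep-feature contrastive loss.

```python
import numpy as np

def low_pass(img, k=5):
    """Box-filter low-pass; the high-frequency part is the residual img - low_pass(img).
    (Stand-in for the paper's adaptive, learnable low-frequency kernel.)"""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + k, j:j + k].mean()
    return out

def cosine(a, b):
    """Cosine similarity between two flattened feature maps."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def decoupling_contrastive_loss(ir, vis, k=5, tau=1.0):
    """InfoNCE-style toy loss: treat cross-modality low-frequency pairs as
    positives (invariant, pulled together) and high-frequency pairs as
    negatives (specific, pushed apart)."""
    ir_low, vis_low = low_pass(ir, k), low_pass(vis, k)
    ir_high, vis_high = ir - ir_low, vis - vis_low
    pos = cosine(ir_low, vis_low)    # encouraged to be high
    neg = cosine(ir_high, vis_high)  # encouraged to be low
    # Minimizing this maximizes the margin between pos and neg similarities.
    return -np.log(np.exp(pos / tau) / (np.exp(pos / tau) + np.exp(neg / tau)))
```

In the paper, this restriction of the solution space is applied to encoder features rather than raw pixels, and the low-frequency weighting is learned per kernel rather than fixed.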
Pages: 16
Related papers
50 records
  • [21] AWDF: An Adaptive Weighted Deep Fusion Architecture for Multi-modality Learning
    Xue, Qinghan
    Kolagunda, Abhishek
    Eliuk, Steven
    Wang, Xiaolong
    2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019, : 2503 - 2512
  • [22] CoCoNet: Coupled Contrastive Learning Network with Multi-level Feature Ensemble for Multi-modality Image Fusion
    Liu, Jinyuan
    Lin, Runjia
    Wu, Guanyao
    Liu, Risheng
    Luo, Zhongxuan
    Fan, Xin
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (05) : 1748 - 1775
  • [24] Joint patch clustering-based adaptive dictionary and sparse representation for multi-modality image fusion
    Wang, Chang
    Wu, Yang
    Yu, Yi
    Zhao, Jun Qiang
    MACHINE VISION AND APPLICATIONS, 2022, 33 (05)
  • [25] An Encoder Generative Adversarial Network for Multi-modality Image Recognition
    Chen, Yu
    Yang, Chunling
    Zhu, Min
    Yang, ShiYan
    IECON 2018 - 44TH ANNUAL CONFERENCE OF THE IEEE INDUSTRIAL ELECTRONICS SOCIETY, 2018, : 2689 - 2694
  • [26] Multi-Modality Cross Attention Network for Image and Sentence Matching
    Wei, Xi
    Zhang, Tianzhu
    Li, Yan
    Zhang, Yongdong
    Wu, Feng
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020, : 10938 - 10947
  • [27] Multi-Modality Deep Network for Extreme Learned Image Compression
    Jiang, Xuhao
    Tan, Weimin
    Tan, Tian
    Yan, Bo
    Shen, Liquan
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 1, 2023, : 1033 - 1041
  • [28] Deep learning supported disease detection with multi-modality image fusion
    Vinnarasi, F. Sangeetha Francelin
    Daniel, Jesline
    Rose, J. T. Anita
    Pugalenthi, R.
    JOURNAL OF X-RAY SCIENCE AND TECHNOLOGY, 2021, 29 (03) : 411 - 434
  • [29] A novel dictionary learning approach for multi-modality medical image fusion
    Zhu, Zhiqin
    Chai, Yi
    Yin, Hongpeng
    Li, Yanxia
    Liu, Zhaodong
    NEUROCOMPUTING, 2016, 214 : 471 - 482
  • [30] Multi-Modality Medical Image Fusion using Discrete Wavelet Transform
    Bhavana, V
    Krishnappa, H. K.
    PROCEEDINGS OF THE 4TH INTERNATIONAL CONFERENCE ON ECO-FRIENDLY COMPUTING AND COMMUNICATION SYSTEMS, 2015, 70 : 625 - 631