AFDFusion: An adaptive frequency decoupling fusion network for multi-modality image

Cited by: 2
Authors
Wang, Chengchao [1]
Zhao, Zhengpeng [1]
Yang, Qiuxia [1]
Nie, Rencan [1]
Cao, Jinde [2, 3]
Pu, Yuanyuan [1, 4]
Affiliations
[1] Yunnan Univ, Coll Informat Sci & Engn, Kunming 650500, Peoples R China
[2] Southeast Univ, Sch Math, Nanjing 210096, Peoples R China
[3] Ahlia Univ, Manama, Bahrain
[4] Univ Key Lab Internet Things Technol & Applicat Yu, Kunming 650500, Yunnan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Image fusion; Adaptive frequency decoupling; Contrastive learning; Associative invariant; Intrinsic specific; ARCHITECTURE; PERFORMANCE; FRAMEWORK; NEST;
DOI
10.1016/j.eswa.2024.125694
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The goal of multi-modality image fusion is to create a single image that provides a comprehensive scene description and conforms to visual perception by integrating the complementary strengths of different modalities, e.g., the salient intensities of infrared images and the detailed textures of visible images. Although some works explore decoupled representations of multi-modality images, they struggle with complex nonlinear relationships, fine-grained modal decoupling, and noise handling. To address these issues, we propose an adaptive frequency decoupling module that perceives the associative invariant and intrinsic specific features across modalities by dynamically adjusting the learnable low-frequency weight of the kernel. Specifically, we use a contrastive learning loss to restrict the solution space of feature decoupling, so that the network learns representations of both the invariant and the specific components of multi-modality images. The underlying idea is that, during decoupling, low-frequency features, which are similar in the representation space, should be pulled closer together, signifying the associative invariant, while high-frequency features should be pushed farther apart, indicating the intrinsic specific. Additionally, a multi-stage training scheme is introduced to achieve decoupling and fusion. In Stage I, a MixEncoder and a MixDecoder with the same architecture but different parameters are trained to perform decoupling and reconstruction, supervised by the contrastive self-supervised mechanism. In Stage II, two feature fusion modules are added to integrate the invariant and specific features and output the fused image. Extensive experiments demonstrate the superiority of the proposed method over state-of-the-art methods in both qualitative and quantitative evaluation on two multi-modal image fusion tasks.
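The pull/push idea in the abstract can be illustrated with a small NumPy sketch. This is not the paper's implementation: the mean-filter low-pass, the scalar `low_weight` (a stand-in for the learnable low-frequency weight), and the cosine-based objective in `decoupling_loss` are all assumptions chosen for illustration, and every function name here is hypothetical.

```python
import numpy as np


def box_low_pass(img, k=5):
    """Low-pass an image with a k x k mean filter (edge padding). Assumed filter."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)


def decouple(img, low_weight):
    """Split an image into low- and high-frequency parts.

    low_weight is a hypothetical scalar standing in for a learnable
    low-frequency weight; the high-frequency part is the residual.
    """
    low = low_weight * box_low_pass(img)
    high = img - low
    return low, high


def cosine(a, b, eps=1e-8):
    """Cosine similarity between two flattened arrays."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def decoupling_loss(low_a, low_b, high_a, high_b):
    """Toy objective (an assumption, not the paper's contrastive loss).

    Pulls the low-frequency parts of two modalities together and
    penalizes positive similarity between their high-frequency parts.
    """
    pull = 1.0 - cosine(low_a, low_b)
    push = max(0.0, cosine(high_a, high_b))
    return pull + push


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    infrared = rng.random((16, 16))   # placeholder data, not a real image
    visible = rng.random((16, 16))
    low_ir, high_ir = decouple(infrared, low_weight=0.8)
    low_vis, high_vis = decouple(visible, low_weight=0.8)
    print(decoupling_loss(low_ir, low_vis, high_ir, high_vis))
```

In a trainable setting, `low_weight` would be a parameter updated by gradient descent on such an objective together with a reconstruction term; the sketch only evaluates the loss for a fixed weight.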
Pages: 16