CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion

Cited by: 259
Authors
Zhao, Zixiang [1 ,2 ]
Bai, Haowen [1 ]
Zhang, Jiangshe [1 ]
Zhang, Yulun [2 ]
Xu, Shuang [3 ,4 ]
Lin, Zudi [5 ]
Timofte, Radu [2 ,6 ]
Van Gool, Luc [2 ]
Affiliations
[1] Xi An Jiao Tong Univ, Xian, Peoples R China
[2] Swiss Fed Inst Technol, Comp Vis Lab, Zurich, Switzerland
[3] Northwestern Polytech Univ Shenzhen, Inst Res & Dev, Shenzhen, Peoples R China
[4] Northwestern Polytech Univ, Xian, Peoples R China
[5] Harvard Univ, Cambridge, MA USA
[6] Univ Wurzburg, Wurzburg, Germany
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR | 2023
Funding
National Natural Science Foundation of China;
Keywords
NETWORK;
DOI
10.1109/CVPR52729.2023.00572
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-modality (MM) image fusion aims to render fused images that maintain the merits of different modalities, e.g., functional highlights and detailed textures. To tackle the challenge of modeling cross-modality features and decomposing desirable modality-specific and modality-shared features, we propose a novel Correlation-Driven feature Decomposition Fusion (CDDFuse) network. First, CDDFuse uses Restormer blocks to extract cross-modality shallow features. We then introduce a dual-branch Transformer-CNN feature extractor with Lite Transformer (LT) blocks leveraging long-range attention to handle low-frequency global features and Invertible Neural Network (INN) blocks focusing on extracting high-frequency local information. A correlation-driven loss is further proposed to make the low-frequency features correlated and the high-frequency features uncorrelated based on the embedded information. Then, the LT-based global fusion and INN-based local fusion layers output the fused image. Extensive experiments demonstrate that our CDDFuse achieves promising results in multiple fusion tasks, including infrared-visible image fusion and medical image fusion. We also show that CDDFuse can boost the performance in downstream infrared-visible semantic segmentation and object detection in a unified benchmark. The code is available at https://github.com/Zhaozixiang1228/MMIF-CDDFuse.
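The correlation-driven loss described above can be sketched in a few lines. The snippet below is a minimal numpy illustration, not the paper's exact implementation: it assumes the loss takes the form of a ratio that is minimized when the high-frequency (detail) features of the two modalities are decorrelated and the low-frequency (base) features are correlated; the function and argument names are hypothetical.

```python
import numpy as np

def cc(a, b, eps=1e-8):
    """Pearson correlation coefficient between two flattened feature maps."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float((a * b).sum() / (np.sqrt((a ** 2).sum() * (b ** 2).sum()) + eps))

def decomposition_loss(low_ir, low_vis, high_ir, high_vis, eps=1.01):
    """Hypothetical correlation-driven decomposition loss.

    Minimizing CC(high)^2 / (CC(low) + eps) simultaneously pushes the
    high-frequency features toward zero correlation (numerator -> 0) and
    the low-frequency features toward high correlation (denominator grows);
    eps keeps the denominator positive since CC lies in [-1, 1].
    """
    return cc(high_ir, high_vis) ** 2 / (cc(low_ir, low_vis) + eps)
```

For example, two identical low-frequency maps paired with orthogonal high-frequency maps yield a loss of zero, whereas identical features in both branches (a failed decomposition) give a strictly larger loss.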
Pages: 5906-5916
Page count: 11
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (05) : 2614 - 2623