DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion

Cited by: 72

Authors:

Zhao, Zixiang ^{[1,2]}

Bai, Haowen ^{[1]}

Zhu, Yuanzhi ^{[2]}

Zhang, Jiangshe ^{[1]}

Xu, Shuang

Zhang, Yulun ^{[2]}

Zhang, Kai ^{[2]}

Meng, Deyu

Timofte, Radu ^{[2]}

Van Gool, Luc ^{[2]}

Affiliations:

[1] Xi An Jiao Tong Univ, Xian, Peoples R China

[2] Swiss Fed Inst Technol, Comp Vis Lab, Zurich, Switzerland

Source:

2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023) | 2023

Funding:

National Natural Science Foundation of China;

Keywords:

NETWORK; NEST;

DOI:

10.1109/ICCV51070.2023.00742

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Multi-modality image fusion aims to combine different modalities to produce fused images that retain the complementary features of each modality, such as functional highlights and texture details. To leverage strong generative priors and address challenges such as unstable training and lack of interpretability for GAN-based generative methods, we propose a novel fusion algorithm based on the denoising diffusion probabilistic model (DDPM). The fusion task is formulated as a conditional generation problem under the DDPM sampling framework, which is further divided into an unconditional generation subproblem and a maximum likelihood subproblem. The latter is modeled in a hierarchical Bayesian manner with latent variables and inferred by the expectation-maximization (EM) algorithm. By integrating the inference solution into the diffusion sampling iteration, our method can generate high-quality fused images with natural image generative priors and cross-modality information from source images. Note that all we require is an unconditional pre-trained generative model, and no fine-tuning is needed. Our extensive experiments indicate that our approach yields promising fusion results in infrared-visible image fusion and medical image fusion. The code is available at https://github.com/Zhaozixiang1228/MMIF-DDFM.

Citation

Pages: 8048-8059

Page count: 12
