MM-Net: A MixFormer-Based Multi-Scale Network for Anatomical and Functional Image Fusion

Cited by: 17
Authors
Liu, Yu [1, 2]
Yu, Chen [1, 2]
Cheng, Juan [1, 2]
Wang, Z. Jane [3 ]
Chen, Xun [4 ]
Affiliations
[1] Hefei Univ Technol, Dept Biomed Engn, Hefei 230009, Peoples R China
[2] Hefei Univ Technol, Anhui Prov Key Lab Measuring Theory & Precis Inst, Hefei 230009, Peoples R China
[3] Univ British Columbia, Dept Elect & Comp Engn, Vancouver, BC V6T 1Z4, Canada
[4] Univ Sci & Technol China, Dept Elect Engn & Informat Sci, Hefei 230026, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Medical image fusion; transformer; convolutional neural networks; multi-scale feature learning; QUALITY ASSESSMENT; PERFORMANCE; INFORMATION; FRAMEWORK;
DOI
10.1109/TIP.2024.3374072
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Anatomical and functional image fusion is an important technique in a variety of medical and biological applications. Recently, deep learning (DL)-based methods have become a mainstream direction in the field of multi-modal image fusion. However, existing DL-based fusion approaches have difficulty capturing local features and global contextual information simultaneously. In addition, the scale diversity of features, which is a crucial issue in image fusion, receives inadequate attention in most existing works. In this paper, to address the above problems, we propose a MixFormer-based multi-scale network, termed MM-Net, for anatomical and functional image fusion. In our method, an improved MixFormer-based backbone is introduced to extract both local features and global contextual information at multiple scales from the source images. The features from different source images are fused at multiple scales by a multi-source spatial attention-based cross-modality feature fusion (CMFF) module. The scale diversity of the fused features is further enriched by a series of multi-scale feature interaction (MSFI) modules and feature aggregation upsample (FAU) modules. Moreover, a loss function consisting of both spatial domain and frequency domain components is devised to train the proposed fusion model. Experimental results demonstrate that our method outperforms several state-of-the-art fusion methods in both qualitative and quantitative comparisons, and the proposed fusion model exhibits good generalization capability. The source code of our fusion method will be available at https://github.com/yuliu316316.
Pages: 2197-2212
Page count: 16