MAINet: Modality-Aware Interaction Network for Medical Image Fusion

Cited by: 0
Authors
Wei, Lisi [1 ,2 ,3 ]
Zhao, Libo [1 ,3 ]
Zhang, Xiaoli [1 ,3 ]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun, Peoples R China
[2] Hulunbuir Univ, Coll Artificial Intelligence & Big Data, Hulunbuir, Peoples R China
[3] Jilin Univ, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun, Peoples R China
Keywords
Multimodal fusion; Medical image fusion; Cascade modality interaction; Modality-awareness; Graph convolutional network; QUALITY ASSESSMENT; INFORMATION; ATTENTION; ALGORITHM; ENSEMBLE;
DOI
10.1145/3731247
CLC Classification Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Due to the limitations of imaging sensors, obtaining a medical image that simultaneously captures both functional metabolic data and structural tissue details remains a significant challenge in clinical diagnosis. To address this, Multimodal Medical Image Fusion (MMIF) has emerged as an effective technique for integrating complementary information from multimodal source images, such as CT, PET, and SPECT, which is critical for providing a comprehensive understanding of both the anatomical and functional aspects of the human body. A key challenge in MMIF is how to exchange and aggregate this multimodal information. This article rethinks MMIF by addressing how to harmonize modality gaps and proposes a novel Modality-Aware Interaction Network (MAINet), which leverages cross-modal feature interaction and progressively fuses multiple features in graph space. Specifically, we introduce two key modules: the Cascade Modality Interaction (CMI) module and the Dual-Graph Learning (DGL) module. The CMI module, integrated within a triple-branch multi-scale encoder, facilitates complementary multimodal feature learning and provides beneficial feedback that enhances discriminative feature learning across modalities. In the decoding process, the DGL module aggregates hierarchical features in two distinct graph spaces, enabling global feature interactions. Moreover, the DGL module incorporates a bottom-up guidance mechanism in which deeper semantic features guide the learning of shallower detail features, improving the fusion process by enhancing both scale diversity and modality awareness and yielding results with high visual fidelity. Experimental results on medical image datasets demonstrate the superiority of the proposed method over existing fusion approaches in both subjective and objective evaluations. We also validate the performance of the proposed method in downstream applications such as infrared-visible image fusion and medical image segmentation.
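The abstract describes fusing modality features in graph space with graph convolutions. The paper's actual CMI and DGL implementations are not reproduced here; the following is only a minimal NumPy sketch of the general idea — propagating each modality's node features through a graph convolution and merging the results. All function names, the toy adjacency, and the simple averaging rule are assumptions for illustration, not the authors' design.

```python
import numpy as np

def graph_conv(x, adj, w):
    """One graph-convolution step: symmetrically normalize the adjacency
    (with self-loops), aggregate neighbor features, project, apply ReLU."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    a_norm = a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ x @ w, 0.0)

def fuse_modalities(feat_a, feat_b, adj, w):
    """Hypothetical graph-space fusion: propagate each modality's node
    features independently, then average the two propagated results."""
    return 0.5 * (graph_conv(feat_a, adj, w) + graph_conv(feat_b, adj, w))

# Toy example: 4 graph nodes (e.g., image regions) with 3-dim features
# per modality, connected in a simple ring.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=float)
feat_ct = rng.standard_normal((4, 3))    # e.g., structural (CT) features
feat_pet = rng.standard_normal((4, 3))   # e.g., functional (PET) features
w = rng.standard_normal((3, 3))
fused = fuse_modalities(feat_ct, feat_pet, adj, w)
print(fused.shape)  # prints (4, 3)
```

In MAINet the interaction is far richer (cascaded feedback across triple branches and two distinct graph spaces with bottom-up guidance); this sketch only shows why a graph formulation permits global, non-local feature interactions between modalities.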
Pages: 23