Deep adaptive fusion with cross-modality feature transition and modality quaternion learning for medical image fusion

Cited: 1
Authors
Srivastava, Somya [1 ]
Bhatia, Shaveta [2 ]
Agrawal, Arun Prakash [3 ]
Jayswal, Anant Kumar [4 ]
Godara, Jyoti [5 ]
Dubey, Gaurav [6 ]
Affiliations
[1] ABES Engn Coll, Dept Comp Sci, Ghaziabad, UP, India
[2] Manav Rachna Int Inst Res & Studies, Faridabad, India
[3] Bennett Univ, Sch Comp Sci Engn & Technol, Greater Noida, India
[4] Amity Univ, Amity Sch Engn & Technol, Noida, UP, India
[5] Shree Guru Gobind Singh Tricentenary Univ, Dept Comp Sci Engn, Gurugram 122505, Haryana, India
[6] KIET Grp Inst, Dept Comp Sci, Ghaziabad, UP, India
Keywords
Image fusion; Multimodal imaging; Attention network; Imaging data integration; Deep sparse coding
DOI
10.1007/s12530-024-09648-8
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
In today's rapidly advancing medical landscape, the integration of information from multiple imaging modalities, known as medical image fusion, stands at the forefront of diagnostic innovation. This approach combines the strengths of diverse techniques such as magnetic resonance imaging (MRI), computed tomography (CT), positron emission tomography (PET), and single-photon emission computed tomography (SPECT) to offer a more comprehensive view of a patient's condition. Issues such as data heterogeneity, where varied resolutions and contrasts must be harmonized, complicate the seamless integration of imaging data, and the complexity of interpreting fused images demands specialized training for clinicians and raises concerns about potential diagnostic errors. This work presents the deep adaptive fusion (Deep-AF) model for image fusion in multimodal biomedical scans, including MRI, CT, PET, and SPECT. The Deep-AF model integrates convolutional neural network (CNN)-based decision maps, deep sparse coding, cross-modality feature transition, and fusion techniques. Three pre-processing steps (intensity normalization, noise reduction, and spatial registration) are first applied to improve alignment and quality in the fused images. Non-subsampled contourlet thresholding (NSCTT) is employed to address intensity, resolution, and contrast differences among modalities, providing a multi-scale and multidirectional representation. To mitigate challenges in spatial alignment, cross-modality interpretation, and model generalization, the proposed gradient-weighted class activation mapping with CNN (GradCAM-CNN) enhances interpretability by visualizing the regions most influential for CNN predictions. Deep sparse coding fusion (DSCF) adaptively learns complex features, capturing high-level representations while enforcing sparsity. The cross-modality feature transition mechanism (CMFTM) accounts for variations in modality characteristics. The attention-weighted averaging network (AtWANet) addresses multimodal feature fusion by dynamically assigning attention weights to each modality based on relevance, providing a flexible approach that tolerates misalignment and scale variations and ensures effective integration of varied representations. Simulation results show that the proposed Deep-AF model achieves robust fusion in terms of statistical and accuracy metrics.
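The GradCAM-CNN component builds on standard gradient-weighted class activation mapping. The Python sketch below shows the generic Grad-CAM computation; model and target_layer are placeholders, since the paper's CNN architecture is not specified in this record.

import torch
import torch.nn.functional as F

def grad_cam(model, target_layer, image, class_idx):
    """Heatmap of the regions that drive the CNN's score for class_idx."""
    acts, grads = [], []
    h1 = target_layer.register_forward_hook(lambda m, i, o: acts.append(o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))
    try:
        score = model(image)[0, class_idx]   # scalar logit for the target class
        model.zero_grad()
        score.backward()                     # populates grads via the hook
    finally:
        h1.remove()
        h2.remove()
    a, g = acts[0], grads[0]                             # both (1, C, h, w)
    w = g.mean(dim=(2, 3), keepdim=True)                 # global-average-pool the gradients
    cam = F.relu((w * a).sum(dim=1, keepdim=True))       # (1, 1, h, w) activation map
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
    return cam / (cam.max() + 1e-8)                      # normalize to [0, 1]

Overlaying the returned map on the input scan highlights which anatomy the network relied on, which is the interpretability role the abstract assigns to GradCAM-CNN.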
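For deep sparse coding fusion (DSCF), the abstract states only that sparsity is enforced while high-level features are captured. The NumPy sketch below illustrates a classical single-layer surrogate: ISTA sparse coding of corresponding patches followed by a max-absolute-coefficient fusion rule. The dictionary D, the regularization weight lam, and the fusion rule are assumptions for illustration, not the authors' method.

import numpy as np

def soft_threshold(x, lam):
    # Proximal operator of the l1 norm.
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def ista(D, y, lam=0.1, iters=100):
    """Solve min_a 0.5*||y - D a||^2 + lam*||a||_1 by iterative soft-thresholding."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2   # 1/L, L = Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(iters):
        a = soft_threshold(a + step * D.T @ (y - D @ a), lam * step)
    return a

def fuse_patches(D, y1, y2, lam=0.1):
    """Fuse two registered modality patches by keeping the larger sparse coefficient."""
    a1, a2 = ista(D, y1, lam), ista(D, y2, lam)
    fused = np.where(np.abs(a1) >= np.abs(a2), a1, a2)
    return D @ fused   # reconstruct the fused patch

# Usage: a random unit-norm dictionary over 8x8 patches (64-dim), 128 atoms.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)
patch = fuse_patches(D, rng.standard_normal(64), rng.standard_normal(64))

Keeping the coefficient with the larger magnitude is a common convention in sparse-representation fusion: salient structure in either modality produces large coefficients, so the rule preserves detail from whichever scan expresses it more strongly.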
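To make the fusion step concrete, the following is a minimal PyTorch sketch of attention-weighted averaging in the spirit of AtWANet as described above. The module name ModalityAttentionFusion, the 1x1-convolution scoring heads, and the per-pixel softmax are illustrative assumptions, not the authors' published implementation.

import torch
import torch.nn as nn

class ModalityAttentionFusion(nn.Module):
    """Fuse per-modality feature maps with learned, spatially varying weights."""

    def __init__(self, channels: int, num_modalities: int):
        super().__init__()
        # One 1x1-conv scoring head per modality: features -> scalar score map.
        self.score_heads = nn.ModuleList(
            nn.Conv2d(channels, 1, kernel_size=1) for _ in range(num_modalities)
        )

    def forward(self, features):
        # features: list of (B, C, H, W) tensors, one per modality (e.g. MRI, CT).
        scores = torch.cat(
            [head(f) for head, f in zip(self.score_heads, features)], dim=1
        )                                                # (B, M, H, W)
        weights = torch.softmax(scores, dim=1)           # sums to 1 over modalities
        stacked = torch.stack(features, dim=1)           # (B, M, C, H, W)
        return (weights.unsqueeze(2) * stacked).sum(1)   # attention-weighted average

# Usage: fuse 64-channel MRI and CT feature maps into one (1, 64, 128, 128) map.
fusion = ModalityAttentionFusion(channels=64, num_modalities=2)
mri_feat = torch.randn(1, 64, 128, 128)
ct_feat = torch.randn(1, 64, 128, 128)
fused = fusion([mri_feat, ct_feat])

Because the softmax is computed per pixel, such a module can favor one modality at bone boundaries and another in soft tissue, which matches the abstract's claim of relevance-based dynamic weighting.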
Pages: 26