Deep adaptive fusion with cross-modality feature transition and modality quaternion learning for medical image fusion
Cited by: 1
Authors:
Srivastava, Somya [1]
Bhatia, Shaveta [2]
Agrawal, Arun Prakash [3]
Jayswal, Anant Kumar [4]
Godara, Jyoti [5]
Dubey, Gaurav [6]
Affiliations:
[1] ABES Engn Coll, Dept Comp Sci, Ghaziabad, UP, India
[2] Manav Rachna Int Inst Res & Studies, Faridabad, India
[3] Bennett Univ, Sch Comp Sci Engn & Technol, Greater Noida, India
[4] Amity Univ, Amity Sch Engn & Technol, Noida, UP, India
[5] Shree Guru Gobind Singh Tricentenary Univ, Dept Comp Sci Engn, Gurugram 122505, Haryana, India
[6] KIET Grp Inst, Dept Comp Sci, Ghaziabad, UP, India
Keywords:
Image fusion;
Multimodal imaging;
Attention network;
Imaging data integration;
Deep sparse coding;
MODEL;
DOI:
10.1007/s12530-024-09648-8
Chinese Library Classification:
TP18 [Artificial Intelligence Theory];
Discipline codes:
081104; 0812; 0835; 1405
Abstract:
In today's rapidly advancing medical landscape, the integration of information from multiple imaging modalities, known as medical image fusion, stands at the forefront of diagnostic innovation. This approach combines the strengths of diverse techniques such as magnetic resonance imaging (MRI), computed tomography (CT), positron emission tomography (PET), and single-photon emission computed tomography (SPECT) to offer a more comprehensive view of a patient's condition. Issues such as data heterogeneity, where varied resolutions and contrasts must be harmonized, complicate the seamless integration of imaging data. The complexity of interpreting fused images also demands specialized training for clinicians and raises concerns about potential diagnostic errors. This work presents the deep adaptive fusion (Deep-AF) model for image fusion of multimodal biomedical scans, including MRI, CT, PET, and SPECT. The Deep-AF model integrates convolutional neural network (CNN)-based decision maps, deep sparse coding, cross-modality feature transition, and fusion techniques. Three pre-processing steps, comprising intensity normalization, noise reduction, and spatial registration, are first applied to improve alignment and quality in the fused images. Non-subsampled contourlet thresholding (NSCTT) is employed to address intensity, resolution, and contrast differences among modalities, providing a multi-scale and multi-directional representation. To mitigate challenges in spatial alignment, cross-modality interpretation, and model generalization, the proposed gradient-weighted class activation mapping with CNN (GradCAM-CNN) enhances interpretability by visualizing the regions most influential for CNN predictions. Deep sparse coding fusion (DSCF) adaptively learns complex features, capturing high-level structure while enforcing sparsity. The cross-modality feature transition mechanism (CMFTM) accounts for variations in modality characteristics. The attention weighted averaging network (AtWANet) addresses multimodal feature fusion by dynamically assigning weights to each modality based on relevance, offering a flexible approach despite misalignment and scale variations; its training optimizes the fusion process so that the varied representations are integrated effectively. Simulation results show that the proposed Deep-AF model achieves robust fusion performance in terms of statistical and accuracy metrics.
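As a concrete illustration of the attention-weighted averaging idea behind AtWANet, the sketch below scores each modality's feature map and softmax-normalizes the scores into per-sample fusion weights. This is a minimal PyTorch sketch under assumed shapes and a two-modality setup; the class name, scoring-head design, and layer sizes are illustrative assumptions, not the authors' implementation.

# A minimal PyTorch sketch of attention-weighted modality fusion in the
# spirit of AtWANet as described in the abstract. The class name, the
# scoring-head design, and the two-modality example are illustrative
# assumptions, not the paper's actual architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionWeightedFusion(nn.Module):
    """Fuses per-modality feature maps via dynamically assigned attention weights."""

    def __init__(self, channels: int, num_modalities: int = 2):
        super().__init__()
        # One small scoring head per modality: global average pooling
        # followed by a linear layer producing a scalar relevance score.
        self.score_heads = nn.ModuleList(
            nn.Sequential(
                nn.AdaptiveAvgPool2d(1),  # (B, C, H, W) -> (B, C, 1, 1)
                nn.Flatten(),             # -> (B, C)
                nn.Linear(channels, 1),   # -> (B, 1)
            )
            for _ in range(num_modalities)
        )

    def forward(self, features: list[torch.Tensor]) -> torch.Tensor:
        # Score each modality, then softmax-normalize across modalities so the
        # fused map is a convex, attention-weighted average of the inputs.
        scores = torch.cat(
            [head(f) for head, f in zip(self.score_heads, features)], dim=1
        )                                          # (B, M)
        weights = F.softmax(scores, dim=1)         # (B, M), rows sum to 1
        stacked = torch.stack(features, dim=1)     # (B, M, C, H, W)
        w = weights.view(*weights.shape, 1, 1, 1)  # broadcastable to stacked
        return (w * stacked).sum(dim=1)            # (B, C, H, W)

# Example: fuse hypothetical MRI and CT feature maps of matching shape.
mri_feat = torch.randn(1, 64, 128, 128)
ct_feat = torch.randn(1, 64, 128, 128)
fusion = AttentionWeightedFusion(channels=64, num_modalities=2)
fused = fusion([mri_feat, ct_feat])  # shape: (1, 64, 128, 128)

The softmax over modality scores is what lets the network trade modalities off per sample; trained jointly with the rest of the pipeline, such weights realize the dynamically assigned attention the abstract describes.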
Pages: 26