Diffusion model-based text-guided enhancement network for medical image segmentation

Cited by: 14
Authors
Dong, Zhiwei [1]
Yuan, Genji [1]
Hua, Zhen [1]
Li, Jinjiang [2]
Affiliations
[1] Shandong Technol & Business Univ, Sch Comp Sci & Technol, Yantai, Peoples R China
[2] Shandong Technol & Business Univ, Sch Informat & Elect Engn, Yantai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Denoising diffusion model; Text attention mechanism; Guided feature enhancement; Medical image segmentation; CONVOLUTIONAL NEURAL-NETWORK; CELL-NUCLEI; MISDIAGNOSIS; CLASSIFICATION;
DOI
10.1016/j.eswa.2024.123549
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, denoising diffusion models have achieved remarkable success in generating pixel-level representations with semantic values for image generation modeling. In this study, we propose a novel end-to-end framework, called TGEDiff, focusing on medical image segmentation. TGEDiff fuses a textual attention mechanism with the diffusion model by introducing an additional auxiliary categorization task to guide the diffusion model with textual information to generate excellent pixel-level representations. To overcome the limited perceptual fields of the independent feature encoders within the diffusion model, we introduce a multi-kernel excitation module to extend the model's perceptual capability. Meanwhile, a guided feature enhancement module is introduced in Denoising-UNet to focus the model's attention on important regions and attenuate the influence of noise and irrelevant background in medical images. We critically evaluated TGEDiff on three datasets (Kvasir-SEG, Kvasir-Sessile, and GLaS), and TGEDiff achieved significant improvements over the state-of-the-art approach on all three datasets, with F1 scores and mIoU improving by 0.88% and 1.09%, 3.21% and 3.43%, and 1.29% and 2.34%, respectively. These data validate that TGEDiff has excellent performance in medical image segmentation. TGEDiff is expected to facilitate accurate diagnosis and treatment of medical diseases through more precise deconvolutional structural segmentation.
Pages: 18