PSTNet: Enhanced Polyp Segmentation With Multi-Scale Alignment and Frequency Domain Integration

Cited by: 3
Authors
Xu, Wenhao [1]
Xu, Rongtao [2, 3]
Wang, Changwei [4, 5, 6]
Li, Xiuli [7]
Xu, Shibiao [1]
Guo, Li [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Artificial Intelligence, Beijing 100876, Peoples R China
[2] Chinese Acad Sci, Inst Automat, State Key Lab Multimodal Artificial Intelligence S, Beijing, Peoples R China
[3] Mohamed Bin Zayed Univ Artificial Intelligence, Abu Dhabi, U Arab Emirates
[4] Qilu Univ Technol, Shandong Acad Sci, Key Lab Comp Power Network & Informat Secur, Minist Educ,Shandong Comp Sci Ctr,Natl Supercomp C, Jinan 250013, Peoples R China
[5] Shandong Fundamental Res Ctr Comp Sci, Shandong Prov Key Lab Comp Networks, Jinan 250013, Peoples R China
[6] Chinese Acad Sci, Inst Automat, State Key Lab Multimodal Artificial Intelligence S, Beijing 100876, Peoples R China
[7] Deepwise Healthcare, AI Lab, Beijing 100080, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image segmentation; Feature extraction; Transformers; Accuracy; Frequency-domain analysis; Location awareness; Colonoscopy; Polyp segmentation; shunted transformer; multi-scale fusion; VALIDATION;
DOI
10.1109/JBHI.2024.3421550
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Accurate segmentation of colorectal polyps in colonoscopy images is crucial for effective diagnosis and management of colorectal cancer (CRC). However, current deep learning-based methods primarily rely on fusing RGB information across multiple scales, leading to limitations in accurately identifying polyps due to restricted RGB domain information and challenges in feature misalignment during multi-scale aggregation. To address these limitations, we propose the Polyp Segmentation Network with Shunted Transformer (PSTNet), a novel approach that integrates both RGB and frequency domain cues present in the images. PSTNet comprises three key modules: the Frequency Characterization Attention Module (FCAM) for extracting frequency cues and capturing polyp characteristics, the Feature Supplementary Alignment Module (FSAM) for aligning semantic information and reducing misalignment noise, and the Cross Perception localization Module (CPM) for synergizing frequency cues with high-level semantics to achieve efficient polyp segmentation. Extensive experiments on challenging datasets demonstrate PSTNet's significant improvement in polyp segmentation accuracy across various metrics, consistently outperforming state-of-the-art methods. The integration of frequency domain cues and the novel architectural design of PSTNet contribute to advancing computer-assisted polyp segmentation, facilitating more accurate diagnosis and management of CRC.
Pages: 6042-6053
Page count: 12
Related papers
50 in total
  • [31] MBDA-Net: Multi-source boundary-aware prototype alignment domain adaptation for polyp segmentation
    Yan, Jiawei
    Zhu, Hongqing
    Hou, Tong
    Chen, Ning
    Lu, Weiping
    Wang, Ying
    Huang, Bingcang
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 96
  • [32] Learning Semantic Alignment Using Global Features and Multi-Scale Confidence
    Xu, Huaiyuan
    Liao, Jing
    Liu, Huaping
    Sun, Yuxiang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (02) : 897 - 910
  • [33] Multi-Scale Network for Thoracic Organs Segmentation
    Khalil, Muhammad Ibrahim
    Tehsin, Samabia
    Humayun, Mamoona
    Jhanjhi, N. Z.
    AlZain, Mohammed A.
    CMC-COMPUTERS MATERIALS & CONTINUA, 2022, 70 (02) : 3251 - 3265
  • [34] SpEx: Multi-Scale Time Domain Speaker Extraction Network
    Xu, Chenglin
    Rao, Wei
    Chng, Eng Siong
    Li, Haizhou
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 1370 - 1384
  • [35] Multi-scale image segmentation based on contourlet-domain hidden Markov trees model
    Sha, YH
    Cong, L
    Sun, Q
    Jiao, LC
    JOURNAL OF INFRARED AND MILLIMETER WAVES, 2005, 24 (06) : 472 - 476
  • [36] Multi-Scale Fusion U-Net for the Segmentation of Breast Lesions
    Li, Jingyao
    Cheng, Lianglun
    Xia, Tingjian
    Ni, Haomin
    Li, Jiao
    IEEE ACCESS, 2021, 9 : 137125 - 137139
  • [37] Multi-Scale Convolutional Features Network for Semantic Segmentation in Indoor Scenes
    Wang, Yanran
    Chen, Qingliang
    Chen, Shilang
    Wu, Junjun
    IEEE ACCESS, 2020, 8 : 89575 - 89583
  • [38] Multi-Scale Neighborhood Feature Extraction and Aggregation for Point Cloud Segmentation
    Li, Dawei
    Shi, Guoliang
    Wu, Yuhao
    Yang, Yanping
    Zhao, Mingbo
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (06) : 2175 - 2191
  • [39] Multi-Scale Self-Guided Attention for Medical Image Segmentation
    Sinha, Ashish
    Dolz, Jose
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2021, 25 (01) : 121 - 130
  • [40] MUSTER: A Multi-Scale Transformer-Based Decoder for Semantic Segmentation
    Xu, Jing
    Shi, Wentao
    Gao, Pan
    Li, Qizhu
    Wang, Zhengwei
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2025, 9 (01) : 202 - 212