RSAFormer: A method of polyp segmentation with region self-attention transformer

Cited by: 2
Authors
Yin X. [1]
Zeng J. [1]
Hou T. [1]
Tang C. [1]
Gan C. [2]
Jain D.K. [3, 4]
García S. [5]
Affiliations
[1] School of Software Engineering, Chongqing University of Posts and Telecommunications, Chongqing
[2] School of Cyber Security and Information Law, Chongqing University of Posts and Telecommunications, Chongqing
[3] Key Laboratory of Intelligent Control and Optimization for Industrial Equipment of Ministry of Education, Dalian University of Technology, Dalian
[4] Symbiosis Institute of Technology, Symbiosis International University, Pune
[5] Department of Computer Science and Artificial Intelligence, Andalusian Research Institute in Data Science and Computational Intelligence, University of Granada, Granada
Funding
National Natural Science Foundation of China
Keywords
Colonoscopy; Polyp segmentation; Region self-attention; Transformer
DOI
10.1016/j.compbiomed.2024.108268
Abstract
Colonoscopy is of great importance for the early screening and clinical diagnosis of colon cancer, yet fine segmentation of polyps remains a challenging task. Existing state-of-the-art models still have limited segmentation ability because the boundaries between normal tissue and polyps are unclear and the two are highly similar in appearance. To deal with this problem, we propose a region self-attention enhancement network (RSAFormer) with a transformer encoder to capture more robust features. Unlike traditional methods, which typically employ a single decoder, RSAFormer employs a dual decoder structure to generate various feature maps, which offers more flexibility and detail in feature extraction. RSAFormer also introduces a region self-attention enhancement module (RSA) to acquire more accurate feature information and foster a stronger interplay between low-level and high-level features. This module enhances uncertain areas, which are indicated by regional context, to extract more precise boundary information. Extensive experiments were conducted on five prevalent polyp datasets to demonstrate RSAFormer's proficiency. It achieves 92.2% and 83.5% mean Dice on Kvasir and ETIS, respectively, outperforming most state-of-the-art models. © 2024 Elsevier Ltd
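The mean Dice figures quoted in the abstract can be read against the standard definition of the Dice coefficient, 2|P ∩ G| / (|P| + |G|), for a predicted mask P and a ground-truth mask G. The sketch below is illustrative only and is not taken from the paper: the function names `dice_score` and `mean_dice`, the smoothing term `eps`, and the choice to average per image are all assumptions, since this record does not describe the evaluation protocol.

```python
from typing import Sequence

# Binary mask as rows of 0/1 values: 1 = polyp pixel, 0 = background.
Mask = Sequence[Sequence[int]]


def dice_score(pred: Mask, target: Mask, eps: float = 1e-8) -> float:
    """Dice coefficient 2*|P and G| / (|P| + |G|) for two equal-shape binary masks."""
    intersection = 0
    pred_area = 0
    target_area = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += p * t  # counts pixels set in both masks
            pred_area += p
            target_area += t
    # Assumption of this sketch: eps keeps the ratio defined when both masks are empty.
    return (2.0 * intersection + eps) / (pred_area + target_area + eps)


def mean_dice(preds: Sequence[Mask], targets: Sequence[Mask]) -> float:
    """Average the per-image Dice score over a set of images (assumed protocol)."""
    scores = [dice_score(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)
```

For example, a prediction `[[1, 1], [0, 0]]` against a ground truth `[[1, 0], [0, 0]]` shares one pixel out of a combined area of three, giving a Dice score of about 0.667.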