RSAFormer: A method of polyp segmentation with region self-attention transformer

Cited by: 2
Authors
Yin X. [1]
Zeng J. [1]
Hou T. [1]
Tang C. [1]
Gan C. [2]
Jain D.K. [3, 4]
García S. [5]
Affiliations
[1] School of Software Engineering, Chongqing University of Posts and Telecommunications, Chongqing
[2] School of Cyber Security and Information Law, Chongqing University of Posts and Telecommunications, Chongqing
[3] Key Laboratory of Intelligent Control and Optimization for Industrial Equipment of Ministry of Education, Dalian University of Technology, Dalian
[4] Symbiosis Institute of Technology, Symbiosis International University, Pune
[5] Department of Computer Science and Artificial Intelligence, Andalusian Research Institute in Data Science and Computational Intelligence, University of Granada, Granada
Funding
National Natural Science Foundation of China
Keywords
Colonoscopy; Polyp segmentation; Region self-attention; Transformer
DOI
10.1016/j.compbiomed.2024.108268
Abstract
Colonoscopy is of great importance for the early screening and clinical diagnosis of colon cancer, yet fine segmentation of polyps remains a challenging task. Existing state-of-the-art models still have limited segmentation ability because the boundaries between normal tissue and polyps are unclear and the two are highly similar in appearance. To deal with this problem, we propose a region self-attention enhancement network (RSAFormer) with a transformer encoder to capture more robust features. Unlike traditional methods, which typically employ a single decoder, RSAFormer employs a dual decoder structure to generate various feature maps, offering more flexibility and detail in feature extraction. RSAFormer also introduces a region self-attention enhancement module (RSA) to acquire more accurate feature information and foster a stronger interplay between low-level and high-level features. This module enhances uncertain areas, which are identified by regional context, to extract more precise boundary information. Extensive experiments were conducted on five prevalent polyp datasets to demonstrate RSAFormer's proficiency. It achieves 92.2% and 83.5% mean Dice on Kvasir and ETIS, respectively, outperforming most of the state-of-the-art models. © 2024 Elsevier Ltd
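The abstract reports results as mean Dice but does not define the metric. As a reading aid, the sketch below shows one common way to compute it: the Dice coefficient 2|P ∩ G| / (|P| + |G|) for each predicted mask P and ground-truth mask G, averaged over the images. The function names `dice` and `mean_dice`, the flattened 0/1 mask representation, and the handling of two empty masks are assumptions made for illustration; the paper's actual evaluation protocol (thresholding, averaging scheme) may differ.

```python
from typing import Sequence


def dice(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice coefficient between two flattened binary masks (1 = polyp, 0 = background)."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of pixels")
    # Pixels marked as polyp in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumption: two empty masks count as perfect agreement.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


def mean_dice(preds: Sequence[Sequence[int]], targets: Sequence[Sequence[int]]) -> float:
    """Average of the per-image Dice scores over a dataset."""
    if len(preds) != len(targets) or not preds:
        raise ValueError("need equally many, at least one, predictions and targets")
    scores = [dice(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Two toy 2x2 images, flattened row by row (made-up values, not data from the paper).
    preds = [[1, 1, 0, 0], [0, 1, 1, 0]]
    targets = [[1, 0, 0, 0], [0, 1, 1, 0]]
    print(round(mean_dice(preds, targets), 4))
```

In the toy example the first image scores 2/3 (one shared pixel, three marked pixels in total) and the second scores 1, so the mean is 5/6.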