SwinE-Net: hybrid deep learning approach to novel polyp segmentation using convolutional neural network and Swin Transformer

Cited by: 69
Authors
Park, Kyeong-Beom [1]
Lee, Jae Yeol [1]
Affiliations
[1] Chonnam Natl Univ, Dept Ind Engn, 77 Yongbong-ro, Gwangju 61186, South Korea
Funding
National Research Foundation, Singapore;
Keywords
polyp segmentation; convolutional neural networks; multidilation convolutional block; multifeature aggregation block; Swin Transformer; Vision Transformer; colonoscopy;
DOI
10.1093/jcde/qwac018
CLC number
TP39 [Computer applications];
Subject classification codes
081203; 0835;
Abstract
Prevention of colorectal cancer (CRC) by inspecting and removing colorectal polyps has become a global health priority because CRC is one of the most frequent cancers in the world. Although recent U-Net-based convolutional neural networks (CNNs) with deep feature representation and skip connections have been shown to segment polyps effectively, U-Net-based approaches still have limitations in modeling explicit global context due to the intrinsic locality of convolutional operations. To overcome these problems, this study proposes a novel deep learning model, SwinE-Net, for polyp segmentation that effectively combines a CNN-based EfficientNet and a Vision Transformer (ViT)-based Swin Transformer. The main challenge is to perform accurate and robust medical segmentation by using the Swin Transformer to maintain global semantics without sacrificing the low-level features of the CNN. First, the multidilation convolutional block refines the multilevel feature maps extracted from the CNN and the ViT to enhance their discriminability. Then, the multifeature aggregation block creates intermediate side outputs from the refined polyp features for efficient training. Finally, the attentive deconvolutional network-based decoder upsamples the refined and combined feature maps to accurately segment colorectal polyps. We compared the proposed approach with previous state-of-the-art methods across various metrics on five public datasets (Kvasir, ClinicDB, ColonDB, ETIS, and EndoScene). In particular, the comparative evaluation showed that the proposed approach performed much better on the unseen datasets, demonstrating its generalization and scalability for polyp segmentation. Furthermore, an ablation study was performed to demonstrate the novelty and advantage of the proposed network. The proposed approach outperformed previous studies.
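To make the pipeline described in the abstract concrete, the following is a minimal PyTorch sketch of the fusion idea: multilevel CNN and ViT feature maps are concatenated per level, refined by parallel dilated convolutions (a stand-in for the multidilation convolutional block), aggregated from coarse to fine, and upsampled into a segmentation mask. The module names, channel sizes, and the random tensors used in place of pretrained EfficientNet and Swin Transformer backbones are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiDilationBlock(nn.Module):
    # Refines a fused CNN+ViT feature map with parallel dilated 3x3 convolutions.
    def __init__(self, in_ch, out_ch, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        )
        self.fuse = nn.Conv2d(out_ch * len(dilations), out_ch, 1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

class FusionDecoder(nn.Module):
    # Fuses per-level CNN and ViT features, refines each level, and decodes a mask.
    def __init__(self, cnn_chs, vit_chs, mid_ch=64):
        super().__init__()
        self.refine = nn.ModuleList(
            MultiDilationBlock(c + v, mid_ch) for c, v in zip(cnn_chs, vit_chs)
        )
        self.head = nn.Conv2d(mid_ch, 1, 1)  # binary polyp-mask logits

    def forward(self, cnn_feats, vit_feats, out_size):
        fused = None
        # Aggregate from the coarsest level to the finest, upsampling as we go.
        for refine, c, v in zip(list(self.refine)[::-1],
                                cnn_feats[::-1], vit_feats[::-1]):
            x = refine(torch.cat([c, v], dim=1))
            if fused is not None:
                x = x + F.interpolate(fused, size=x.shape[-2:],
                                      mode="bilinear", align_corners=False)
            fused = x
        return F.interpolate(self.head(fused), size=out_size,
                             mode="bilinear", align_corners=False)

if __name__ == "__main__":
    # Stand-in multilevel features at strides 8/16/32 for a 352x352 input;
    # channel counts roughly follow EfficientNet-B0 and Swin-T at those strides.
    cnn_chs, vit_chs, sizes = (40, 112, 320), (192, 384, 768), (44, 22, 11)
    cnn_feats = [torch.randn(1, c, s, s) for c, s in zip(cnn_chs, sizes)]
    vit_feats = [torch.randn(1, c, s, s) for c, s in zip(vit_chs, sizes)]
    mask_logits = FusionDecoder(cnn_chs, vit_chs)(cnn_feats, vit_feats, (352, 352))
    print(mask_logits.shape)  # torch.Size([1, 1, 352, 352])

In practice, the per-level inputs would come from pretrained EfficientNet and Swin Transformer backbones, and the intermediate side outputs created for efficient training in the paper are omitted from this sketch.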
Pages: 616-632
Number of pages: 17