SwinE-Net: hybrid deep learning approach to novel polyp segmentation using convolutional neural network and Swin Transformer

Cited by: 69
Authors
Park, Kyeong-Beom [1]
Lee, Jae Yeol [1]
Affiliations
[1] Chonnam Natl Univ, Dept Ind Engn, 77 Yongbong-ro, Gwangju 61186, South Korea
Funding
National Research Foundation, Singapore;
Keywords
polyp segmentation; convolutional neural networks; multidilation convolutional block; multifeature aggregation block; Swin Transformer; Vision Transformer; colonoscopy;
DOI
10.1093/jcde/qwac018
CLC number
TP39 [Computer applications];
Subject classification codes
081203; 0835;
Abstract
Prevention of colorectal cancer (CRC) by inspecting and removing colorectal polyps has become a global health priority because CRC is one of the most frequent cancers in the world. Although recent U-Net-based convolutional neural networks (CNNs) with deep feature representation and skip connections have been shown to segment polyps effectively, U-Net-based approaches still have limitations in modeling explicit global context due to the intrinsic locality of convolutional operations. To overcome these problems, this study proposes a novel deep learning model, SwinE-Net, for polyp segmentation that effectively combines a CNN-based EfficientNet and a Vision Transformer (ViT)-based Swin Transformer. The main challenge is to perform accurate and robust medical segmentation by using the Swin Transformer to maintain global semantics without sacrificing the low-level features of the CNN. First, the multidilation convolutional block refines the multilevel feature maps extracted from the CNN and the ViT to enhance their discriminability. Then, the multifeature aggregation block creates intermediate side outputs from the refined polyp features for efficient training. Finally, the attentive deconvolutional network-based decoder upsamples the refined and combined feature maps to accurately segment colorectal polyps. We compared the proposed approach with previous state-of-the-art methods across various metrics on five public datasets (Kvasir, ClinicDB, ColonDB, ETIS, and EndoScene). In particular, the comparative evaluation showed that the proposed approach performed much better on the unseen datasets, demonstrating its generalization and scalability for polyp segmentation. Furthermore, an ablation study was performed to demonstrate the novelty and advantage of the proposed network. The proposed approach outperformed previous studies.
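To make the pipeline described in the abstract concrete, the following is a minimal PyTorch sketch of the fusion idea: multilevel CNN and ViT feature maps are concatenated per level, refined by parallel dilated convolutions (a stand-in for the multidilation convolutional block), aggregated from coarse to fine, and upsampled into a segmentation mask. The module names, channel sizes, and the random tensors used in place of pretrained EfficientNet and Swin Transformer backbones are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiDilationBlock(nn.Module):
    # Refines a fused CNN+ViT feature map with parallel dilated 3x3 convolutions.
    def __init__(self, in_ch, out_ch, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        )
        self.fuse = nn.Conv2d(out_ch * len(dilations), out_ch, 1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

class FusionDecoder(nn.Module):
    # Fuses per-level CNN and ViT features, refines each level, and decodes a mask.
    def __init__(self, cnn_chs, vit_chs, mid_ch=64):
        super().__init__()
        self.refine = nn.ModuleList(
            MultiDilationBlock(c + v, mid_ch) for c, v in zip(cnn_chs, vit_chs)
        )
        self.head = nn.Conv2d(mid_ch, 1, 1)  # binary polyp-mask logits

    def forward(self, cnn_feats, vit_feats, out_size):
        fused = None
        # Aggregate from the coarsest level to the finest, upsampling as we go.
        for refine, c, v in zip(list(self.refine)[::-1],
                                cnn_feats[::-1], vit_feats[::-1]):
            x = refine(torch.cat([c, v], dim=1))
            if fused is not None:
                x = x + F.interpolate(fused, size=x.shape[-2:],
                                      mode="bilinear", align_corners=False)
            fused = x
        return F.interpolate(self.head(fused), size=out_size,
                             mode="bilinear", align_corners=False)

if __name__ == "__main__":
    # Stand-in multilevel features at strides 8/16/32 for a 352x352 input;
    # channel counts roughly follow EfficientNet-B0 and Swin-T at those strides.
    cnn_chs, vit_chs, sizes = (40, 112, 320), (192, 384, 768), (44, 22, 11)
    cnn_feats = [torch.randn(1, c, s, s) for c, s in zip(cnn_chs, sizes)]
    vit_feats = [torch.randn(1, c, s, s) for c, s in zip(vit_chs, sizes)]
    mask_logits = FusionDecoder(cnn_chs, vit_chs)(cnn_feats, vit_feats, (352, 352))
    print(mask_logits.shape)  # torch.Size([1, 1, 352, 352])

In practice, the per-level inputs would come from pretrained EfficientNet and Swin Transformer backbones, and the intermediate side outputs created for efficient training in the paper are omitted from this sketch.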
Pages: 616-632
Number of pages: 17