Ensembles of Convolutional Neural Networks and Transformers for Polyp Segmentation

Cited by: 15

Authors:

Nanni, Loris [1]

Fantozzi, Carlo [1]

Loreggia, Andrea [2]

Lumini, Alessandra [3]

Affiliations:

[1] Univ Padua, Dept Informat Engn, I-35122 Padua, Italy

[2] Univ Brescia, Dept Informat Engn, I-25121 Brescia, Italy

[3] Univ Bologna, Dept Comp Sci & Engn, I-40126 Bologna, Italy

Source:

SENSORS | 2023, Volume 23, Issue 10

Keywords:

polyp segmentation; computer vision; ensemble; transformers; convolutional neural networks; IMAGES;

DOI:

10.3390/s23104688

Chinese Library Classification (CLC) number:

O65 [Analytical Chemistry];

Discipline classification code:

070302 ; 081704 ;

Abstract:

In the realm of computer vision, semantic segmentation is the task of recognizing objects in images at the pixel level. This is done by performing a classification of each pixel. The task is complex and requires sophisticated skills and knowledge about the context to identify objects' boundaries. The importance of semantic segmentation in many domains is undisputed. In medical diagnostics, it simplifies the early detection of pathologies, thus mitigating the possible consequences. In this work, we provide a review of the literature on deep ensemble learning models for polyp segmentation and develop new ensembles based on convolutional neural networks and transformers. The development of an effective ensemble entails ensuring diversity between its components. To this end, we combined different models (HarDNet-MSEG, Polyp-PVT, and HSNet) trained with different data augmentation techniques, optimization methods, and learning rates, which we experimentally demonstrate to be useful to form a better ensemble. Most importantly, we introduce a new method to obtain the segmentation mask by averaging intermediate masks after the sigmoid layer. In our extensive experimental evaluation, the average performance of the proposed ensembles over five prominent datasets beat any other solution that we know of. Furthermore, the ensembles also performed better than the state-of-the-art on two of the five datasets, when individually considered, without having been specifically trained for them.
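The fusion step mentioned in the abstract (averaging intermediate masks after the sigmoid layer) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `sigmoid` and `ensemble_mask`, the 0.5 threshold, the equal weighting of models, and the toy 2x2 logit maps are all assumptions made for illustration. The abstract does not state how the averaged map is binarized or whether masks are resized first.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function mapping logits to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def ensemble_mask(logit_maps, threshold=0.5):
    """Fuse per-model outputs by averaging their post-sigmoid maps.

    logit_maps: list of arrays with identical shape (H, W), one per model.
    threshold:  assumed cut-off for binarizing the averaged probabilities.
    Returns (binary_mask, mean_probability_map).
    """
    # Apply the sigmoid to each model's raw output first ...
    probs = [sigmoid(np.asarray(m, dtype=np.float64)) for m in logit_maps]
    # ... then average the resulting probability maps pixel by pixel.
    mean_prob = np.mean(probs, axis=0)
    binary = (mean_prob >= threshold).astype(np.uint8)
    return binary, mean_prob


# Toy logits standing in for three hypothetical segmentation models.
logits_a = np.array([[4.0, -4.0], [0.5, -0.5]])
logits_b = np.array([[3.0, -3.0], [-2.0, 2.0]])
logits_c = np.array([[5.0, -5.0], [1.0, -3.0]])

mask, mean_prob = ensemble_mask([logits_a, logits_b, logits_c])
print(mask)
print(np.round(mean_prob, 3))
```

In this toy input, the bottom-left pixel is labeled foreground by two of the three models individually, yet the averaged probability falls just below 0.5 because the dissenting model is confident. Averaging probabilities therefore takes each model's confidence into account, which a hard majority vote would not.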


Pages: 19
