Empirical Study of Attention-Based Models for Automatic Classification of Gastrointestinal Endoscopy Images

Cited by: 0

Authors:

Espantaleon-Perez, Ricardo ^{[1]}

Jimenez-Velasco, Isabel ^{[1]}

Munoz-Salinas, Rafael ^{[1,2]}

Marin-Jimenez, Manuel J. ^{[1,2]}

Affiliations:

[1] Univ Cordoba, Dept Comp & Numer Anal, Cordoba, Spain

[2] Maimonides Inst Biomed Res Cordoba IMIBIC, Cordoba, Spain

Source:

COMPUTER ANALYSIS OF IMAGES AND PATTERNS, CAIP 2023, PT II | 2023 / Vol. 14185

Keywords:

Attention; Transformers; Endoscopy; Medical Image;

DOI:

10.1007/978-3-031-44240-7_10

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Automatic and accurate analysis of medical images is a subject of great importance in our current society. In particular, this work focuses on gastrointestinal endoscopy images, as the study of these images helps to detect possible health conditions in those regions. Published works on this topic have mainly used traditional classification methods (e.g., Support Vector Machines) or more modern techniques, such as Convolutional Neural Networks. However, little attention has been paid to more recent approaches such as Transformers or, in general, attention-based Deep Neural Networks. This work aims to evaluate the performance of state-of-the-art attention-based models on the problem of classifying gastrointestinal endoscopy images. The experimental results on the challenging Hyper-Kvasir dataset indicate that attention-based models achieve performance equal to or better than that obtained by previous models, while requiring fewer parameters. In addition, a new state of the art on Hyper-Kvasir (i.e., 0.636 F1-Macro) is obtained by the fusion of two MobileViT models with only 20M parameters. The source code will be published here: https://github.com/richardesp/Attention-based-models-for-Hyper-Kvasir/.

Citation

Pages: 98-108

Page count: 11

References (19 in total)

[1] HyperKvasir, a comprehensive multi-class image and video dataset for gastrointestinal endoscopy
Borgli, Hanna
Thambawita, Vajira
Smedsrud, Pia H.
Hicks, Steven
Jha, Debesh
Eskeland, Sigrun L.
Randel, Kristin Ranheim
Pogorelov, Konstantin
Lux, Mathias
Nguyen, Duc Tien Dang
Johansen, Dag
Griwodz, Carsten
Stensland, Hakon K.
Garcia-Ceja, Enrique
Schmidt, Peter T.
Hammer, Hugo L.
Riegler, Michael A.
Halvorsen, Pal
de Lange, Thomas
[J]. SCIENTIFIC DATA, 2020, 7 (01)
[2] Accurate Classification of Diminutive Colorectal Polyps Using Computer-Aided Analysis
Chen, Peng-Jen
Lin, Meng-Chiung
Lai, Mei-Ju
Lin, Jung-Chun
Lu, Henry Horng-Shing
Tseng, Vincent S.
[J]. GASTROENTEROLOGY, 2018, 154 (03) : 568 - 575
[3] Dai Zihang, 2021, NEURIPS, V34
[4] DaViT: Dual Attention Vision Transformers
Ding, Mingyu
Xiao, Bin
Codella, Noel
Luo, Ping
Wang, Jingdong
Yuan, Lu
[J]. COMPUTER VISION, ECCV 2022, PT XXIV, 2022, 13684 : 74 - 92
[5] docs.opencv, Structural analysis and shape descriptors
[6] Dosovitskiy A., 2021, INT C LEARN REPR 202, P1
[7] Galdran A., 2021, ICPR WORKSH CHALL
[8] Balanced-MixUp for Highly Imbalanced Medical Image Classification
Galdran, Adrian
Carneiro, Gustavo
Gonzalez Ballester, Miguel A.
[J]. MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT V, 2021, 12905 : 323 - 333
[9] A self-learning teacher-student framework for gastrointestinal image classification
Gjestang, Henrik L.
Hicks, Steven A.
Thambawita, Vajira
Halvorsen, Pal
Riegler, Michael A.
[J]. 2021 IEEE 34TH INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS (CBMS), 2021, : 539 - 544
[10] CMT: Convolutional Neural Networks Meet Vision Transformers
Guo, Jianyuan
Han, Kai
Wu, Han
Tang, Yehui
Chen, Xinghao
Wang, Yunhe
Xu, Chang
[J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 12165 - 12175
