Empirical Study of Attention-Based Models for Automatic Classification of Gastrointestinal Endoscopy Images

Cited by: 0
Authors
Espantaleon-Perez, Ricardo [1]
Jimenez-Velasco, Isabel [1]
Munoz-Salinas, Rafael [1, 2]
Marin-Jimenez, Manuel J. [1, 2]
Affiliations
[1] Univ Cordoba, Dept Comp & Numer Anal, Cordoba, Spain
[2] Maimonides Inst Biomed Res Cordoba IMIBIC, Cordoba, Spain
Source
COMPUTER ANALYSIS OF IMAGES AND PATTERNS, CAIP 2023, PT II | 2023 / Vol. 14185
Keywords
Attention; Transformers; Endoscopy; Medical Image
DOI
10.1007/978-3-031-44240-7_10
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Automatic and accurate analysis of medical images is a subject of great importance in our current society. In particular, this work focuses on gastrointestinal endoscopy images, as the study of these images helps to detect possible health conditions in those regions. Published works on this topic have mainly used traditional classification methods (e.g., Support Vector Machines) or more modern techniques, such as Convolutional Neural Networks. However, little attention has been paid to more recent approaches such as Transformers or, in general, Attention-based Deep Neural Networks. This work aims to evaluate the performance of state-of-the-art attention-based models on the problem of classifying gastrointestinal endoscopy images. The experimental results on the challenging Hyper-Kvasir dataset indicate that attention-based models achieve performance equal to or better than that obtained by previous models, while needing fewer parameters. In addition, a new state of the art on Hyper-Kvasir (i.e., 0.636 F1-Macro) is obtained by the fusion of two MobileViT models with only 20M parameters. The source code will be published here: https://github.com/richardesp/Attention-based-models-for-Hyper-Kvasir/.
Pages: 98-108
Page count: 11
Related papers
19 items in total
  • [1] HyperKvasir, a comprehensive multi-class image and video dataset for gastrointestinal endoscopy
    Borgli, Hanna
    Thambawita, Vajira
    Smedsrud, Pia H.
    Hicks, Steven
    Jha, Debesh
    Eskeland, Sigrun L.
    Randel, Kristin Ranheim
    Pogorelov, Konstantin
    Lux, Mathias
    Nguyen, Duc Tien Dang
    Johansen, Dag
    Griwodz, Carsten
    Stensland, Hakon K.
    Garcia-Ceja, Enrique
    Schmidt, Peter T.
    Hammer, Hugo L.
    Riegler, Michael A.
    Halvorsen, Pal
    de Lange, Thomas
    [J]. SCIENTIFIC DATA, 2020, 7 (01)
  • [2] Accurate Classification of Diminutive Colorectal Polyps Using Computer-Aided Analysis
    Chen, Peng-Jen
    Lin, Meng-Chiung
    Lai, Mei-Ju
    Lin, Jung-Chun
    Lu, Henry Horng-Shing
    Tseng, Vincent S.
    [J]. GASTROENTEROLOGY, 2018, 154 (03) : 568 - 575
  • [3] Dai Zihang, 2021, NEURIPS, V34
  • [4] DaViT: Dual Attention Vision Transformers
    Ding, Mingyu
    Xiao, Bin
    Codella, Noel
    Luo, Ping
    Wang, Jingdong
    Yuan, Lu
    [J]. COMPUTER VISION, ECCV 2022, PT XXIV, 2022, 13684 : 74 - 92
  • [5] docs.opencv, Structural analysis and shape descriptors
  • [6] Dosovitskiy A., 2021, INT C LEARN REPR 202, P1
  • [7] Galdran A., 2021, ICPR WORKSH CHALL
  • [8] Balanced-MixUp for Highly Imbalanced Medical Image Classification
    Galdran, Adrian
    Carneiro, Gustavo
    Gonzalez Ballester, Miguel A.
    [J]. MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT V, 2021, 12905 : 323 - 333
  • [9] A self-learning teacher-student framework for gastrointestinal image classification
    Gjestang, Henrik L.
    Hicks, Steven A.
    Thambawita, Vajira
    Halvorsen, Pal
    Riegler, Michael A.
    [J]. 2021 IEEE 34TH INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS (CBMS), 2021, : 539 - 544
  • [10] CMT: Convolutional Neural Networks Meet Vision Transformers
    Guo, Jianyuan
    Han, Kai
    Wu, Han
    Tang, Yehui
    Chen, Xinghao
    Wang, Yunhe
    Xu, Chang
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 12165 - 12175