Vision transformer distillation for enhanced gastrointestinal abnormality recognition in wireless capsule endoscopy images

Authors
Oukdach, Yassine [1]
Garbaz, Anass [1]
Kerkaou, Zakaria [1]
El Ansari, Mohamed [2]
Koutti, Lahcen [1]
Papachrysos, Nikolaos [3, 4]
El Ouafdi, Ahmed Fouad [1]
de Lange, Thomas [3, 4]
Distante, Cosimo [5]
Affiliations
[1] Ibn Zohr Univ, Fac Sci, Dept Comp Sci, LabSIV, Agadir, Morocco
[2] Moulay Ismail Univ, Fac Sci, Dept Comp Sci, Informat & Applicat Lab, Meknes, Morocco
[3] Univ Gothenburg, Sahlgrenska Acad, Dept Mol & Clin Med, Gothenburg, Sweden
[4] Sahlgrens Univ Hosp, Med Dept, Molndal, Sweden
[5] CNR, Inst Appl Sci & Intelligent Syst Eduardo Caianiell, Lecce, Italy
Keywords
wireless capsule endoscopy; vision transformer; convolutional neural network; attention mechanism; knowledge distillation; gastrointestinal abnormality detection; CANCER STATISTICS; SYSTEM; COLON;
DOI
10.1117/1.JMI.12.1.014505
Chinese Library Classification (CLC) number
R8 [Special medicine]; R445 [Diagnostic imaging];
Discipline classification code
1002 ; 100207 ; 1009 ;
Abstract
Purpose: Wireless capsule endoscopy (WCE) is a non-invasive technology used for diagnosing gastrointestinal abnormalities. A single examination generates approximately 55,000 images, making manual review both time-consuming and costly for doctors. Computer vision-assisted systems that aid the diagnostic process are therefore highly desirable.
Approach: We present a deep learning approach that uses knowledge distillation (KD) from a convolutional neural network (CNN) teacher model to a vision transformer (ViT) student model for gastrointestinal abnormality recognition. The CNN teacher model uses attention mechanisms and depth-wise separable convolutions to extract features from WCE images, and it supervises the ViT in learning these representations.
Results: The proposed method achieves accuracies of 97% and 96% on the Kvasir and KID datasets, respectively, demonstrating its effectiveness in distinguishing normal from abnormal regions and bleeding from non-bleeding cases. The proposed approach offers computational efficiency and generalization to unseen datasets, outperforming several state-of-the-art methods.
Conclusions: We proposed a deep learning approach utilizing CNNs and a ViT with KD to effectively classify gastrointestinal diseases in WCE images. It demonstrates promising performance on public datasets, distinguishing normal from abnormal regions and bleeding from non-bleeding cases while offering optimal computational efficiency compared with existing methods, making it suitable for GI disease applications.
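The abstract does not state the exact distillation objective, so the following is only a minimal sketch of the widely used soft-target KD loss (hard-label cross-entropy blended with a temperature-scaled KL divergence between teacher and student outputs). The function names `softmax` and `kd_loss`, and the default values of `temperature` and `alpha`, are illustrative assumptions and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, label, temperature=2.0, alpha=0.5):
    """Generic soft-target distillation loss (assumed form, not the paper's).

    alpha weights the hard-label cross-entropy; (1 - alpha) weights the
    KL divergence between softened teacher and student distributions.
    """
    # Hard-label term: cross-entropy of the student at temperature 1.
    ce = -math.log(softmax(student_logits)[label])

    # Soft-label term: KL(teacher || student) at the given temperature.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)

    # The T^2 factor keeps the soft-term gradient scale comparable across T.
    return alpha * ce + (1.0 - alpha) * temperature ** 2 * kl


# Hypothetical two-class example (e.g. bleeding vs non-bleeding logits).
loss = kd_loss(student_logits=[1.2, -0.3], teacher_logits=[2.5, -1.0], label=0)
```

In this sketch the CNN teacher would supply `teacher_logits` for each WCE image and the ViT student would be trained to minimize `kd_loss`; when the two models agree exactly, the soft term vanishes and only the cross-entropy remains.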
Pages: 20
Related papers
50 in total
  • [21] Automatic Bleeding Frame Detection in the Wireless Capsule Endoscopy Images
    Yuan, Yixuan
    Meng, Max Q-H
    2015 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2015, : 1310 - 1315
  • [22] A Novel Feature for Polyp Detection in Wireless Capsule Endoscopy images
    Yuan, Yixuan
    Meng, Max Q.-H.
    2014 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2014), 2014, : 5010 - 5015
  • [23] Detecting Mucosal Abnormalities from Wireless Capsule Endoscopy Images
    Abiko, Aschalew Tirulo
    Vala, Brijesh
    Patel, Satvik
    INTERNATIONAL CONFERENCE ON INTELLIGENT DATA COMMUNICATION TECHNOLOGIES AND INTERNET OF THINGS, ICICI 2018, 2019, 26 : 872 - 878
  • [24] A deep CNN model for anomaly detection and localization in wireless capsule endoscopy images
    Jain, Samir
    Seal, Ayan
    Ojha, Aparajita
    Yazidi, Anis
    Bures, Jan
    Tacheci, Ilja
    Krejcar, Ondrej
    COMPUTERS IN BIOLOGY AND MEDICINE, 2021, 137
  • [25] Convolution-Enhanced Vision Transformer Network for Smoke Recognition
    Cheng, Guangtao
    Zhou, Yancong
    Gao, Shan
    Li, Yingyu
    Yu, Hao
    FIRE TECHNOLOGY, 2023, 59 (02) : 925 - 948
  • [26] Convolution-Enhanced Vision Transformer Network for Smoke Recognition
    Guangtao Cheng
    Yancong Zhou
    Shan Gao
    Yingyu Li
    Hao Yu
    Fire Technology, 2023, 59 : 925 - 948
  • [27] Gastrointestinal Tract Bleeding Detection from Wireless Capsule Endoscopy Videos
    Charfi, Said
    El Ansari, Mohamed
    PROCEEDINGS OF THE SECOND INTERNATIONAL CONFERENCE ON INTERNET OF THINGS, DATA AND CLOUD COMPUTING (ICC 2017), 2017,
  • [28] Vision Transformer With Hybrid Shifted Windows for Gastrointestinal Endoscopy Image Classification
    Wang, Wei
    Yang, Xin
    Tang, Jinhui
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (09) : 4452 - 4461
  • [29] FEATURE SPACE EXTRAPOLATION FOR ULCER CLASSIFICATION IN WIRELESS CAPSULE ENDOSCOPY IMAGES
    Lee, Changhoo
    Min, Junki
    Cha, Jaemyung
    Lee, Seungkyu
    2019 IEEE 16TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (ISBI 2019), 2019, : 100 - 103
  • [30] Assessment of Crohn's Disease Lesions in Wireless Capsule Endoscopy Images
    Kumar, Rajesh
    Zhao, Qian
    Seshamani, Sharmishtaa
    Mullin, Gerard
    Hager, Gregory
    Dassopoulos, Themistocles
    IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, 2012, 59 (02) : 355 - 362