Self-Ensembling Vision Transformer (SEViT) for Robust Medical Image Classification

被引:20
|
作者
Almalik, Faris [1 ]
Yaqub, Mohammad [1 ]
Nandakumar, Karthik [1 ]
机构
[1] Mohamed Bin Zayed Univ, Artificial Intelligence, Abu Dhabi, U Arab Emirates
来源
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT III | 2022年 / 13433卷
关键词
Adversarial attack; Vision transformer; Self-ensemble;
D O I
10.1007/978-3-031-16437-8_36
中图分类号
R445 [影像诊断学];
学科分类号
100207 ;
摘要
Vision Transformers (ViT) are competing to replace Convolutional Neural Networks (CNN) for various computer vision tasks in medical imaging such as classification and segmentation. While the vulnerability of CNNs to adversarial attacks is a well-known problem, recent works have shown that ViTs are also susceptible to such attacks and suffer significant performance degradation under attack. The vulnerability of ViTs to carefully engineered adversarial samples raises serious concerns about their safety in clinical settings. In this paper, we propose a novel self-ensembling method to enhance the robustness of ViT in the presence of adversarial attacks. The proposed Self-Ensembling Vision Transformer (SEViT) leverages the fact that feature representations learned by initial blocks of a ViT are relatively unaffected by adversarial perturbations. Learning multiple classifiers based on these intermediate feature representations and combining these predictions with that of the final ViT classifier can provide robustness against adversarial attacks. Measuring the consistency between the various predictions can also help detect adversarial samples. Experiments on two modalities (chest X-ray and fundoscopy) demonstrate the efficacy of SEViT architecture to defend against various adversarial attacks in the gray-box (attacker has full knowledge of the target model, but not the defense mechanism) setting. Code: https://github.com/faresmalik/SEViT
引用
收藏
页码:376 / 386
页数:11
相关论文
共 50 条
  • [11] Privacy-Preserving Image Classification Using Vision Transformer
    Qi, Zheng
    MaungMaung, AprilPyone
    Kinoshita, Yuma
    Kiya, Hitoshi
    2022 30TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2022), 2022, : 543 - 547
  • [12] Supervised Contrastive Vision Transformer for Breast Histopathological Image Classification
    Shiri, Mohammad
    Reddy, Monalika Padma
    Sun, Jiangwen
    2024 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION FOR DATA SCIENCE, IRI 2024, 2024, : 296 - 301
  • [13] DUAL TRANSFORMER ENCODER MODEL FOR MEDICAL IMAGE CLASSIFICATION
    Yan, Fangyuan
    Yan, Bin
    Pei, Mingtao
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 690 - 694
  • [14] Vision Transformer with window sequence merging mechanism for image classification
    Jiao, Erjie
    Leng, Qiangkui
    Guo, Jiamei
    Meng, Xiangfu
    Wang, Changzhong
    APPLIED SOFT COMPUTING, 2025, 171
  • [15] FSwin Transformer: Feature-Space Window Attention Vision Transformer for Image Classification
    Yoo, Dayeon
    Kim, Jeesu
    Yoo, Jinwoo
    IEEE ACCESS, 2024, 12 : 72598 - 72606
  • [16] Res-MGCA-SE: a lightweight convolutional neural network based on vision transformer for medical image classification
    Soleimani-Fard S.
    Ko S.-B.
    Neural Computing and Applications, 2024, 36 (28) : 17631 - 17644
  • [17] FishAI: Automated hierarchical marine fish image classification with vision transformer
    Yang, Chenghan
    Zhou, Peng
    Wang, Chun-Sheng
    Fu, Ge-Yi
    Xu, Xue-Wei
    Niu, Zhibin
    Zhu, Lin
    Yuan, Ye
    Shen, Hong-Bin
    Pan, Xiaoyong
    ENGINEERING REPORTS, 2024, 6 (12)
  • [18] REVIEW OF VISION TRANSFORMER MODELS FOR REMOTE SENSING IMAGE SCENE CLASSIFICATION
    Lv, Pengyuan
    Wu, Wenjun
    Zhong, Yanfei
    Zhang, Liangpei
    2022 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2022), 2022, : 2231 - 2234
  • [19] Vision Transformer-Based Ensemble Learning for Hyperspectral Image Classification
    Liu, Jun
    Guo, Haoran
    He, Yile
    Li, Huali
    REMOTE SENSING, 2023, 15 (21)
  • [20] Vision Transformer With Hybrid Shifted Windows for Gastrointestinal Endoscopy Image Classification
    Wang, Wei
    Yang, Xin
    Tang, Jinhui
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (09) : 4452 - 4461