Distilling Knowledge From an Ensemble of Vision Transformers for Improved Classification of Breast Ultrasound

Cited by: 4
Authors
Zhou, George [1 ]
Mosadegh, Bobak [2 ]
Affiliations
[1] Weill Cornell Med, New York, NY 10021 USA
[2] Weill Cornell Med, Dalio Inst Cardiovasc Imaging, Dept Radiol, New York, NY USA
Keywords
Breast ultrasound; Deep learning; Vision transformer; Ensemble learning; Knowledge distillation; NEURAL-NETWORK; CANCER; MAMMOGRAPHY
DOI
10.1016/j.acra.2023.08.006
Chinese Library Classification
R8 [Special Medicine]; R445 [Diagnostic Imaging]
Subject Classification Codes
1002; 100207; 1009
Abstract
Rationale and Objectives: To develop a deep learning model for the automated classification of breast ultrasound images as benign or malignant. Specifically, the application of vision transformers (ViTs), ensemble learning, and knowledge distillation to breast ultrasound classification is explored.

Materials and Methods: Single-view, B-mode ultrasound images were curated from the publicly available Breast Ultrasound Image (BUSI) dataset, which has categorical ground-truth labels (benign vs. malignant) assigned by radiologists, with malignant cases confirmed by biopsy. The performance of ViTs is compared to that of convolutional neural networks (CNNs), followed by a comparison among supervised, self-supervised, and randomly initialized ViTs. Subsequently, an ensemble of 10 independently trained ViTs, in which the ensemble output is the unweighted average of the individual models' outputs, is compared to the performance of each ViT alone. Finally, we train a single ViT to emulate the ensembled ViTs using knowledge distillation.

Results: Using five-fold cross-validation on this dataset, ViTs outperform CNNs, and self-supervised ViTs outperform supervised and randomly initialized ViTs. The ensemble model achieves an area under the receiver operating characteristic curve (AuROC) of 0.977 and an area under the precision-recall curve (AuPRC) of 0.965 on the test set, outperforming the average AuROC and AuPRC of the independently trained ViTs (0.958 ± 0.05 and 0.931 ± 0.016). The distilled ViT achieves an AuROC of 0.972 and an AuPRC of 0.960.

Conclusion: Transfer learning and ensemble learning each independently offer increased performance and can be combined sequentially to further improve the final model. Furthermore, a single vision transformer can be trained to match the performance of an ensemble of vision transformers using knowledge distillation.
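The ensembling step described in the Methods reduces to an unweighted average of each model's predicted class probabilities. Below is a minimal PyTorch sketch of that averaging; the abstract does not include code, so the `models` list and the choice to average softmax probabilities rather than raw logits are assumptions.

```python
import torch

@torch.no_grad()
def ensemble_predict(models, images):
    """Unweighted average of the class probabilities predicted by a list of
    independently trained models (e.g., 10 ViTs), as described in the Methods.
    Averaging softmax outputs rather than logits is an assumption."""
    for m in models:
        m.eval()
    probs = [torch.softmax(m(images), dim=1) for m in models]
    return torch.stack(probs, dim=0).mean(dim=0)  # shape: (batch, num_classes)
```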
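Likewise, the distillation step, in which a single ViT is trained to emulate the ensemble, can be sketched with a standard Hinton-style loss. The temperature `T`, the mixing weight `alpha`, and the use of the ensemble's averaged probabilities as soft targets are all illustrative assumptions not specified in this abstract.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_probs, labels, T=2.0, alpha=0.5):
    """Hinton-style knowledge distillation loss: KL divergence between the
    student's temperature-softened predictions and the ensemble teacher's
    soft targets, blended with ordinary cross-entropy on the hard labels.
    T and alpha are hypothetical hyperparameters, not taken from the paper."""
    # Soften the ensemble's averaged probabilities by treating their log as
    # pseudo-logits; whether the paper softens the teacher is an assumption.
    teacher_soft = F.softmax(torch.log(teacher_probs) / T, dim=1)
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        teacher_soft,
        reduction="batchmean",
    ) * (T * T)  # standard T^2 scaling keeps gradient magnitudes comparable
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1 - alpha) * hard_loss
```

In this sketch, `teacher_probs` would come from `ensemble_predict` above, with the ensemble frozen while the single student ViT is trained.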
Pages: 104-120
Page count: 17
Related Papers (50 total)
  • [1] Distilling efficient Vision Transformers from CNNs for semantic segmentation
    Zheng, Xu
    Luo, Yunhao
    Zhou, Pengyuan
    Wang, Lin
    PATTERN RECOGNITION, 2025, 158
  • [2] Distilling Knowledge from an Ensemble of Models for Punctuation Prediction
    Yi, Jiangyan
    Tao, Jianhua
    Wen, Zhengqi
    Li, Ya
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017: 2779 - 2783
  • [3] BUViTNet: Breast Ultrasound Detection via Vision Transformers
    Ayana, Gelan
    Choe, Se-Woon
    DIAGNOSTICS, 2022, 12 (11)
  • [4] Vision Transformers-Based Transfer Learning for Breast Mass Classification From Multiple Diagnostic Modalities
    Ayana, Gelan
Choe, Se-Woon
    JOURNAL OF ELECTRICAL ENGINEERING & TECHNOLOGY, 2024, 19 (05) : 3391 - 3410
  • [5] Vision Transformers for Breast Cancer Histology Image Classification
    Baroni, Giulia L.
    Rasotto, Laura
    Roitero, Kevin
    Siraj, Ameer Hamza
    Della Mea, Vincenzo
    IMAGE ANALYSIS AND PROCESSING - ICIAP 2023 WORKSHOPS, PT II, 2024, 14366 : 15 - 26
  • [6] Distilling Knowledge From Object Classification to Aesthetics Assessment
    Hou, Jingwen
    Ding, Henghui
    Lin, Weisi
    Liu, Weide
    Fang, Yuming
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (11) : 7386 - 7402
  • [7] Vision Transformers, Ensemble Model, and Transfer Learning Leveraging Explainable AI for Brain Tumor Detection and Classification
    Hossain, Shahriar
    Chakrabarty, Amitabha
    Gadekallu, Thippa Reddy
    Alazab, Mamoun
    Piran, Md. Jalil
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2024, 28 (03) : 1261 - 1272
  • [8] Optimizing Vision Transformers for Histopathology: Pretraining and Normalization in Breast Cancer Classification
    Baroni, Giulia Lucrezia
    Rasotto, Laura
    Roitero, Kevin
    Tulisso, Angelica
    Di Loreto, Carla
    Della Mea, Vincenzo
    JOURNAL OF IMAGING, 2024, 10 (05)
  • [9] PEDNet: A Plain and Efficient Knowledge Distillation Network for Breast Tumor Ultrasound Image Classification
    Liu, Tongtong
    Wang, Yiru
    Wang, Wenhang
    Yang, Mengyao
    Zhang, Lan
    Zhang, Ge
    Dang, Hao
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT V, ICIC 2024, 2024, 14866 : 404 - 415
  • [10] An ensemble learning integration of multiple CNN with improved vision transformer models for pest classification
    Xia, Wanshang
    Han, Dezhi
    Li, Dun
    Wu, Zhongdai
    Han, Bing
    Wang, Junxiang
    ANNALS OF APPLIED BIOLOGY, 2023, 182 (02) : 144 - 158