Analyzing Transfer Learning of Vision Transformers for Interpreting Chest Radiography

Cited by: 48

Authors:

Usman, Mohammad ^{[1]}

Zia, Tehseen ^{[1,2]}

Tariq, Ali ^{[1]}

Affiliations:

[1] COMSATS Univ Islamabad CUI, Dept Comp Sci, Islamabad, Pakistan

[2] Natl Ctr Artificial Intelligence, Med Imaging & Diagnost Ctr, Islamabad, Pakistan

Source:

JOURNAL OF DIGITAL IMAGING | 2022 / Vol. 35 / Issue 06

Keywords:

Vision transformer; Chest X-rays; Transfer learning; Classification; DIABETIC-RETINOPATHY; VALIDATION; DATASET;

DOI:

10.1007/s10278-022-00666-z

Chinese Library Classification (CLC) number:

R8 [Special Medicine]; R445 [Diagnostic Imaging];

Subject classification codes:

1002 ; 100207 ; 1009 ;

Abstract:

Limited availability of medical imaging datasets is a major limitation when using "data hungry" deep learning to gain performance improvements. To address this issue, transfer learning has become a de facto standard, where a convolutional neural network (CNN) pre-trained, typically on natural images (e.g., ImageNet), is fine-tuned on medical images. Meanwhile, pre-trained transformers, which are self-attention-based models, have become the de facto standard in natural language processing (NLP) and the state of the art in image classification due to their powerful transfer learning abilities. Inspired by the success of transformers in NLP and image classification, large-scale transformers (such as the vision transformer) are trained on natural images. Based on these recent developments, this research aims to explore the efficacy of transformers pre-trained on natural images for medical images. Specifically, we analyze a pre-trained vision transformer on the CheXpert and pediatric pneumonia datasets. We use standard CNN models, including VGGNet and ResNet, as baselines. By examining the acquired representations and results, we find that transfer learning from the pre-trained vision transformer yields improved results compared to pre-trained CNNs, which demonstrates a greater transfer ability of transformers in medical imaging.

Pages: 1445-1462

Page count: 18
