A Fusion Deep Learning Model of ResNet and Vision Transformer for 3D CT Images

Cited by: 1

Authors:

Liu, Chiyu [1,2]

Sun, Cunjie [1,3]

Affiliations:

[1] Xuzhou Med Univ, Dept Med Imaging, Xuzhou 221004, Peoples R China

[2] First Peoples Hosp Xuzhou, Imaging Ctr, Xuzhou 221002, Peoples R China

[3] Xuzhou Med Univ, Affiliated Hosp, Informat Dept, Xuzhou 221006, Peoples R China

Source:

IEEE ACCESS | 2024 / Vol. 12

Keywords:

Deep learning; fusion model; 3D CT images; COVID-19; Resnet; 3D; video swin transformer

DOI:

10.1109/ACCESS.2024.3423689

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

The outbreak of COVID-19 has seriously affected human life and property, and rapid, effective diagnosis is key to preventing and treating the disease. In this study, we introduce a new fusion model, "Reswin", trained on 3D CT data to diagnose COVID-19. The model combines two mainstream computer vision models, ResNet 3D (a convolutional neural network) and Video Swin Transformer (a vision transformer), whose predictions are fused by soft voting. We compared Reswin with ResNet 3D-50, Swin-T, MViT, R(2+1)D-50, SlowFast-50, X3D, and CSN101, which are state-of-the-art deep learning models for 3D image classification. Reswin achieved an accuracy of 0.9099, precision of 0.9266, F1 score of 0.9425, AUC of 0.9541, and AUPR of 0.9861 in binary classification, and an accuracy of 0.8655, precision of 0.8580, and F1 score of 0.8620 in three-class classification. Reswin provides a new solution for 3D CT image classification and new ideas for the development of deep learning in 3D medical imaging.
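The soft voting step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how the two models are weighted, so equal weights are an assumption here, and the function names (`softmax`, `soft_vote`) and the example logits are hypothetical. The sketch averages the class probabilities of two classifiers and takes the most probable class; it is not the authors' implementation.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Convert raw scores to probabilities in a numerically stable way."""
    shifted = logits - np.max(logits, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def soft_vote(logits_a, logits_b, weight_a=0.5):
    """Fuse two classifiers by a weighted average of their class probabilities.

    weight_a is an assumed parameter (0.5 means equal weighting).
    Returns the fused probabilities and the predicted class per sample.
    """
    probs_a = softmax(np.asarray(logits_a, dtype=float))
    probs_b = softmax(np.asarray(logits_b, dtype=float))
    fused = weight_a * probs_a + (1.0 - weight_a) * probs_b
    return fused, fused.argmax(axis=-1)

# Hypothetical logits for two scans and two classes (negative, positive).
cnn_logits = np.array([[2.0, 0.0], [0.2, 0.0]])          # stand-in for the CNN branch
transformer_logits = np.array([[1.0, 0.0], [0.0, 2.0]])  # stand-in for the transformer branch

fused_probs, labels = soft_vote(cnn_logits, transformer_logits)
print(fused_probs)
print(labels)
```

For the second scan the two branches disagree, so a hard majority vote would tie. Soft voting resolves it in favour of the more confident branch, which is the usual motivation for averaging probabilities instead of counting votes.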


Pages: 93389-93397

Page count: 9

