Efficient pneumonia detection using Vision Transformers on chest X-rays

Cited by: 25

Authors:

Singh, Sukhendra [1]

Kumar, Manoj [1]

Kumar, Abhay [2]

Verma, Birendra Kumar [1]

Abhishek, Kumar [2]

Selvarajan, Shitharth [3]

Affiliations:

[1] JSS Acad Tech Educ, Noida, India

[2] Natl Inst Technol Patna, Patna, India

[3] Leeds Beckett Univ, Sch Built Environm Engn & Comp, Leeds LS1 3HE, England

Source:

SCIENTIFIC REPORTS | 2024 / Volume 14 / Issue 01

Keywords:

ATTENTION;

DOI:

10.1038/s41598-024-52703-2

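Chinese Library Classification (CLC) codes:

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];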

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

Pneumonia is a widespread and acute respiratory infection that impacts people of all ages. Early detection and treatment of pneumonia are essential for avoiding complications and enhancing clinical results. By devising and deploying effective detection methods, we can reduce mortality, improve healthcare efficiency, and contribute to the global battle against a disease that has plagued humanity for centuries. Detecting pneumonia is not only a medical necessity but also a humanitarian imperative and a technological frontier. Chest X-rays are a frequently used imaging modality for diagnosing pneumonia. This paper examines in detail a method for detecting pneumonia based on the Vision Transformer (ViT) architecture, applied to a public dataset of chest X-rays available on Kaggle. To capture global context and spatial relationships in chest X-ray images, the proposed framework deploys the ViT model, which integrates self-attention mechanisms and the transformer architecture. In our experiments, the proposed ViT-based framework achieves an accuracy of 97.61%, a sensitivity of 95%, and a specificity of 98% in detecting pneumonia from chest X-rays. The ViT model is well suited to capturing global context, comprehending spatial relationships, and processing images of different resolutions. The framework establishes its efficacy as a robust pneumonia detection solution by surpassing convolutional neural network (CNN) based architectures.
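The abstract reports accuracy, sensitivity, and specificity. As a reminder of how these three figures relate to a binary confusion matrix, here is a minimal sketch in plain Python. The function name `binary_metrics` and the toy labels are assumptions made for illustration; they are not taken from the paper, and the numbers below are not the paper's results.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for 0/1 labels.

    Convention assumed here: 1 = pneumonia (positive), 0 = normal (negative).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four confusion-matrix cells.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    nan = float("nan")
    return {
        # Fraction of all cases classified correctly.
        "accuracy": (tp + tn) / total if total else nan,
        # Fraction of true pneumonia cases that were detected (recall).
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Fraction of normal cases that were correctly cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Toy labels, made up for illustration only (not data from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are imbalanced, because accuracy alone can look high even if one class is frequently misclassified.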

References

Pages: 17

64 entries in total
