Efficient pneumonia detection using Vision Transformers on chest X-rays

Cited by: 25
Authors
Singh, Sukhendra [1]
Kumar, Manoj [1]
Kumar, Abhay [2]
Verma, Birendra Kumar [1]
Abhishek, Kumar [2]
Selvarajan, Shitharth [3]
Affiliations
[1] JSS Acad Tech Educ, Noida, India
[2] Natl Inst Technol Patna, Patna, India
[3] Leeds Beckett Univ, Sch Built Environm Engn & Comp, Leeds LS1 3HE, England
Keywords
ATTENTION;
DOI
10.1038/s41598-024-52703-2
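Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract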
Pneumonia is a widespread, acute respiratory infection that affects people of all ages. Early detection and treatment are essential for avoiding complications and improving clinical outcomes. Effective detection methods can reduce mortality, improve healthcare efficiency, and contribute to the global fight against a disease that has plagued humanity for centuries. Detecting pneumonia is therefore not only a medical necessity but also a humanitarian imperative and a technological frontier. Chest X-rays are a frequently used imaging modality for diagnosing pneumonia. This paper examines in detail a method for detecting pneumonia that applies the Vision Transformer (ViT) architecture to a public dataset of chest X-rays available on Kaggle. To capture global context and spatial relationships in chest X-ray images, the proposed framework uses the ViT model, which combines self-attention mechanisms with the transformer architecture. In our experiments, the proposed ViT-based framework achieves an accuracy of 97.61%, a sensitivity of 95%, and a specificity of 98% in detecting pneumonia from chest X-rays. The ViT model is well suited to capturing global context, modeling spatial relationships, and processing images of different resolutions. By surpassing convolutional neural network (CNN) based architectures, the framework demonstrates its efficacy as a robust pneumonia detection solution.
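The three figures reported in the abstract (accuracy, sensitivity, specificity) are standard confusion-matrix ratios. The sketch below shows how they are typically computed for a binary pneumonia/normal classifier. The function and class names, and the example counts, are hypothetical illustrations. They are not the paper's code or its actual confusion matrix.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, with 'pneumonia' as the positive class."""
    tp: int  # pneumonia images predicted as pneumonia
    fn: int  # pneumonia images predicted as normal
    tn: int  # normal images predicted as normal
    fp: int  # normal images predicted as pneumonia


def counts_from_labels(y_true: Sequence[int], y_pred: Sequence[int]) -> ConfusionCounts:
    """Tally counts from 0/1 labels (1 = pneumonia, 0 = normal)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return ConfusionCounts(tp=tp, fn=fn, tn=tn, fp=fp)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all images classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of pneumonia images that were detected (true positive rate)."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of normal images correctly left unflagged (true negative rate)."""
    return c.tn / (c.tn + c.fp)


# Hypothetical counts chosen only to illustrate the arithmetic.
example = ConfusionCounts(tp=95, fn=5, tn=98, fp=2)
```

Accuracy depends on the class balance of the test set as well as on sensitivity and specificity, so the three values cannot be derived from one another without knowing how many pneumonia and normal images were evaluated.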
Pages: 17