Comparing PCA-Based Machine Learning Algorithms for COVID-19 Classification Using Chest X-ray Images

Times cited: 0
Authors
Ali, Hussein Ahmed [1 ,2 ]
Hariri, Walid [3 ]
Zghal, Nadia Smaoui [4 ]
Ben Aissa, Dalenda [1 ]
Affiliations
[1] Univ Tunis El Manar, Fac Sci Tunis, Microwave Elect Res Lab, Tunis El Manar, Tunisia
[2] Univ Kirkuk, Coll Comp Sci & Informat Technol, Kirkuk, Iraq
[3] Badji Mokhtar Annaba Univ, Dept Comp Sci, LabGED Lab, Annaba, Algeria
[4] Univ Sfax, Control & Energy Management Lab, CEM Lab ENIS, Sfax, Tunisia
Keywords
Chest X-ray (CXR); COVID-19; Decision tree; Gaussian Naïve; Stochastic gradient descent; Bayes; Machine learning; DIAGNOSIS
DOI
10.21123/bsj.2024.9422
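Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Subject classification codes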
07 ; 0710 ; 09 ;
Abstract
The rapid spread of COVID-19 has strained healthcare systems worldwide, creating a need for efficient diagnostic methods. Polymerase Chain Reaction (PCR) and antigen tests are common, but they are limited in speed and precision. Improving the accuracy of imaging techniques, especially Chest X-ray (CXR) and Computerized Tomography (CT) scans, is therefore crucial for detecting COVID-19-related lung abnormalities. CXR is cost-effective and accessible, and is preferred over CT, but accurate diagnosis often requires technological support. To address this, an extensive dataset of CXR images categorized into five classes is available on Kaggle. Processing such data involves grayscale conversion, image intensity adjustment, resizing, and feature extraction using Principal Component Analysis (PCA). Six Machine Learning (ML) techniques are employed for image classification: Decision Tree (DT), Random Forest (RF), Stochastic Gradient Descent (SGD), Logistic Regression (LR), Gaussian Naive Bayes (GNB), and K-Nearest Neighbors (KNN). DT achieves the highest accuracy at 88%, outperforming GNB (77%), LR (74%), KNN (71%), SGD (70%), and RF (45%). DT also performs best on the other assessment metrics (F1-score, sensitivity, and precision), with the best weighted average of 88%. However, the optimal ML model depends on factors such as dataset characteristics and implementation specifics, so these factors should be considered carefully when choosing an ML model for COVID-19 diagnosis via CXR image classification.
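The preprocessing and classification steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no parameters, so the 32 x 32 target size, the 10 principal components, k = 3, the min-max intensity stretch, and the nearest-neighbour resize are all assumptions, and the images are synthetic stand-ins generated by the hypothetical helper `make_synthetic_images`. KNN (one of the six listed classifiers) is used here only because it is short to write with NumPy alone.

```python
import numpy as np

def to_grayscale(img):
    """Collapse an H x W x 3 RGB array to H x W (ITU-R BT.601 luma weights)."""
    if img.ndim == 2:
        return img.astype(np.float64)
    return img[..., :3].astype(np.float64) @ np.array([0.299, 0.587, 0.114])

def adjust_intensity(img):
    """Min-max stretch to [0, 1] (an assumed form of intensity adjustment)."""
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros_like(img, dtype=np.float64)
    return (img - lo) / (hi - lo)

def resize_nearest(img, size):
    """Nearest-neighbour resize of a 2-D array to size x size."""
    h, w = img.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[np.ix_(rows, cols)]

def preprocess(img, size=32):
    """Grayscale -> intensity adjustment -> resize -> flat feature vector."""
    return resize_nearest(adjust_intensity(to_grayscale(img)), size).ravel()

def pca_fit(X, n_components):
    """Return the training mean and the top principal axes (via SVD)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    """Project centred samples onto the principal axes."""
    return (X - mean) @ components.T

def knn_predict(train_X, train_y, test_X, k=3):
    """Majority vote among the k nearest training samples (Euclidean)."""
    dists = ((test_X[:, None, :] - train_X[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return np.array([np.bincount(votes).argmax() for votes in train_y[nearest]])

def make_synthetic_images(n_per_class, n_classes, rng, shape=(64, 64, 3)):
    """Hypothetical stand-in data: noise plus a bright band whose row
    position depends on the class. Real CXR images would be loaded instead."""
    images, labels = [], []
    for c in range(n_classes):
        for _ in range(n_per_class):
            img = rng.integers(0, 50, size=shape)
            img[c * 12:(c + 1) * 12] += 150
            images.append(img)
            labels.append(c)
    return images, np.array(labels)

def run_demo(seed=0, n_classes=5, n_per_class=20, n_components=10):
    """Run the full sketch and return test accuracy on the synthetic data."""
    rng = np.random.default_rng(seed)
    images, y = make_synthetic_images(n_per_class, n_classes, rng)
    X = np.stack([preprocess(img) for img in images])

    # 80/20 train/test split (assumed; the abstract does not state one).
    order = rng.permutation(len(y))
    cut = int(0.8 * len(y))
    train, test = order[:cut], order[cut:]

    # Fit PCA on the training split only, then project both splits.
    mean, comps = pca_fit(X[train], n_components)
    Z_train = pca_transform(X[train], mean, comps)
    Z_test = pca_transform(X[test], mean, comps)

    pred = knn_predict(Z_train, y[train], Z_test)
    return float((pred == y[test]).mean())

if __name__ == "__main__":
    print("test accuracy on synthetic data:", run_demo())
```

Fitting PCA on the training split only keeps test images from influencing the projection. Accuracy on this synthetic data says nothing about the figures reported in the abstract.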
Pages: 687-705
Page count: 20