Automatic Classification Between COVID-19 and Non-COVID-19 Pneumonia Using Symptoms, Comorbidities, and Laboratory Findings: The Khorshid COVID Cohort Study

Cited by: 5
Authors
Marateb, Hamid Reza [1]
Ziaie Nezhad, Farzad [1]
Mohebian, Mohammad Reza [2]
Sami, Ramin [3]
Haghjooy Javanmard, Shaghayegh [4]
Dehghan Niri, Fatemeh [5]
Akafzadeh-Savari, Mahsa [6]
Mansourian, Marjan [7, 8]
Mananas, Miquel Angel [7, 9]
Wolkewitz, Martin [10, 11]
Binder, Harald [10, 11]
Affiliations
[1] Univ Isfahan, Engn Fac, Biomed Engn Dept, Esfahan, Iran
[2] Univ Saskatchewan, Dept Elect & Comp Engn, Saskatoon, SK, Canada
[3] Isfahan Univ Med Sci, Sch Med, Dept Internal Med, Esfahan, Iran
[4] Isfahan Univ Med Sci, Cardiovasc Res Inst, Appl Physiol Res Ctr, Dept Physiol, Sch Med, Esfahan, Iran
[5] Isfahan Univ Med Sci, Sch Med, Esfahan, Iran
[6] Isfahan Univ Med Sci, Isfahan Clin Toxicol Res Ctr, Esfahan, Iran
[7] Univ Politecn Catalunya Barcelona Tech UPC, Automat Control Dept ESAII, Biomed Engn Res Ctr CREB, Barcelona, Spain
[8] Isfahan Univ Med Sci, Sch Hlth, Dept Epidemiol & Biostat, Esfahan, Iran
[9] Biomed Res Networking Ctr Bioengn Biomat & Nanome, Madrid, Spain
[10] Univ Freiburg, Inst Med Biometry & Stat, Fac Med, Freiburg, Germany
[11] Univ Freiburg, Inst Med Biometry & Stat, Med Ctr, Freiburg, Germany
Funding
EU Horizon 2020
Keywords
COVID-19; computer-aided diagnosis; screening; validation studies; machine learning; CORONAVIRUS DISEASE 2019; CLINICAL CHARACTERISTICS; LOGISTIC-REGRESSION; PREDICTION; DIAGNOSIS; SYSTEM; HEALTH; PROGNOSIS; SELECTION; FAMILY;
DOI
10.3389/fmed.2021.768467
Chinese Library Classification
R5 [Internal Medicine]
Discipline classification codes
1002; 100201
Abstract
Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was a disaster in 2020. Accurate and early diagnosis of COVID-19 is still essential for health policymaking. Reverse transcriptase-polymerase chain reaction (RT-PCR) has served as the operational gold standard for COVID-19 diagnosis. We aimed to design and implement a reliable COVID-19 diagnosis method that estimates the risk of infection from demographics, symptoms and signs, blood markers, and family history of diseases, and that agrees closely with the results obtained by RT-PCR and CT scan. Our study primarily used sample data from a 1-year hospital-based prospective COVID-19 open cohort, the Khorshid COVID Cohort (KCC) study. A sample of 634 patients with COVID-19 and, as the control group, 118 patients with pneumonia who had similar characteristics and whose RT-PCR and chest CT scan were negative (dataset 1) was used to design the system and for internal validation. Two other online datasets, one containing some symptoms (dataset 2) and one containing blood tests (dataset 3), were also analyzed. A combination of one-hot encoding, stability feature selection, over-sampling, and an ensemble classifier was used. Ten-fold stratified cross-validation was performed. In addition to gender and symptom duration, signs and symptoms, blood biomarkers, and comorbidities were selected. Performance indices of the cross-validated confusion matrix for dataset 1 were as follows: sensitivity of 96% [95% confidence interval (CI): 94-98], specificity of 95% [90-99], positive predictive value (PPV) of 99% [98-100], negative predictive value (NPV) of 82% [76-89], diagnostic odds ratio (DOR) of 496 [198-1,245], area under the ROC curve (AUC) of 0.96 [0.94-0.97], Matthews correlation coefficient (MCC) of 0.87 [0.85-0.88], accuracy of 96% [94-98], and Cohen's kappa of 0.86 [0.81-0.91]. The proposed algorithm showed excellent diagnostic accuracy and class-labeling agreement, and fair discriminant power. The AUC on datasets 2 and 3 was 0.97 [0.96-0.98] and 0.92 [0.91-0.94], respectively. The most important features were white blood cell count, shortness of breath, and C-reactive protein for datasets 1, 2, and 3, respectively. The proposed algorithm is thus a promising COVID-19 diagnosis method that could complement simple blood tests and symptom screening. However, RT-PCR and chest CT scan, used here as the gold standard, are not 100% accurate.
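The performance indices listed in the abstract all derive from the four cells of a binary confusion matrix, except the AUC, which needs the classifier's scores. The following is a minimal sketch of those standard formulas, not the authors' code. The function name `diagnostic_indices` and the cell counts are hypothetical: the counts only respect the class sizes of dataset 1 (634 COVID-19, 118 controls) and are not the study's actual confusion matrix, so the printed values are not expected to reproduce the reported figures.

```python
from math import sqrt


def diagnostic_indices(tp, fn, tn, fp):
    """Return standard diagnostic indices for a binary confusion matrix.

    tp/fn: positives (COVID-19) classified correctly/incorrectly.
    tn/fp: negatives (controls) classified correctly/incorrectly.
    """
    n = tp + fn + tn + fp
    accuracy = (tp + tn) / n
    # Chance agreement for Cohen's kappa: product of the marginal totals.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": accuracy,
        # Diagnostic odds ratio; undefined when fp or fn is zero.
        "dor": (tp * tn) / (fp * fn),
        "mcc": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        "kappa": (accuracy - p_chance) / (1 - p_chance),
    }


# Hypothetical counts (assumption): 634 positives and 118 negatives in total.
indices = diagnostic_indices(tp=609, fn=25, tn=112, fp=6)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

With strongly imbalanced classes such as these, NPV, MCC, and kappa are far more sensitive to a handful of misclassified cases than accuracy is, which is one reason to report them alongside it.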
Pages: 14