A machine learning program to identify COVID-19 and other diseases from hematology data

Cited by: 5
Authors
Gladding, Patrick A. [1]
Ayar, Zina [2]
Smith, Kevin [3]
Patel, Prashant [3]
Pearce, Julia [3]
Puwakdandawa, Shalini [3]
Tarrant, Dianne [3]
Atkinson, Jon [3]
McChlery, Elizabeth [3]
Hanna, Merit [4]
Gow, Nick [5]
Bhally, Hasan [5]
Read, Kerry [5]
Jayathissa, Prageeth [6]
Wallace, Jonathan [6]
Norton, Sam [7]
Kasabov, Nick [8]
Calude, Cristian S. [9]
Steel, Deborah [10]
Mckenzie, Colin [10]
Affiliations
[1] Waitemata Dist Hlth Board, Dept Cardiol, Auckland, New Zealand
[2] Waitemata Dist Hlth Board, Clin Informat Serv, Auckland, New Zealand
[3] Waitemata Dist Hlth Board, Clin Lab, Auckland, New Zealand
[4] Waitemata Dist Hlth Board, Dept Hematol, Auckland, New Zealand
[5] Waitemata Dist Hlth Board, Dept Infect Dis, Auckland, New Zealand
[6] Waitemata Dist Hlth Board, Inst Innovat & Improvement i3, Auckland, New Zealand
[7] Nanix Ltd, Dunedin, New Zealand
[8] Auckland Univ Technol, Knowledge Engn & Discovery Res Inst KEDRI, Auckland, New Zealand
[9] Univ Auckland, Sch Comp Sci, Auckland, New Zealand
[10] Sysmex New Zealand Ltd, Auckland, New Zealand
Keywords
biological age; COVID-19; full blood count; heart failure; hematology; machine learning; pneumonia; validation
DOI
10.2144/fsoa-2020-0207
Chinese Library Classification number
R-3 [Medical research methods]; R3 [Basic medicine]
Discipline classification code
1001
Abstract
Aim: We propose a method for screening full blood count metadata for evidence of communicable and noncommunicable diseases using machine learning (ML).
Materials & methods: High-dimensional hematology metadata were extracted over an 11-month period from Sysmex hematology analyzers for 43,761 patients. Predictive models for age, sex and individuality were developed to demonstrate the personalized nature of hematology data. Numeric and raw flow cytometry data were used in both supervised and unsupervised ML to predict the presence of pneumonia, urinary tract infection and COVID-19. Heart failure was used as an additional prediction target to demonstrate that the method generalizes.
Results: Chronological age was predicted by a deep neural network with R²: 0.59, mean absolute error: 12; sex with area under the receiver operating characteristic curve (AUROC): 0.83, phi: 0.47; individuality with 99.7% accuracy, phi: 0.97; pneumonia with AUROC: 0.74, sensitivity 58%, specificity 79%, 95% CI: 0.73-0.75, p < 0.0001; urinary tract infection with AUROC: 0.68, sensitivity 52%, specificity 79%, 95% CI: 0.67-0.68, p < 0.0001; COVID-19 with AUROC: 0.8, sensitivity 82%, specificity 75%, 95% CI: 0.79-0.8, p = 0.0006; and heart failure with AUROC: 0.78, sensitivity 72%, specificity 72%, 95% CI: 0.77-0.78, p < 0.0001.
Conclusion: ML applied to hematology data could predict communicable and noncommunicable diseases, at both local and global levels.
Lay abstract: Identifying and monitoring an infectious disease within a community requires sampling of both symptomatic and asymptomatic individuals. Hematology data from a full blood count contain considerably more information than is used clinically. We applied machine learning to hematology data to predict the presence of both communicable and noncommunicable diseases.
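The results above are reported as AUROC, sensitivity, specificity and phi. As a reading aid, the following minimal Python sketch shows how these four metrics are conventionally computed for a binary classifier. It is not the authors' code: the labels, scores, the 0.5 decision threshold and all function names are made up for illustration, and phi is assumed here to be the standard phi (Matthews correlation) coefficient.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    """Fraction of true cases that the classifier flags."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-cases that the classifier clears."""
    return tn / (tn + fp)


def phi_coefficient(tp, tn, fp, fn):
    """Phi (Matthews correlation) coefficient; 0.0 if undefined."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auroc(y_true, scores):
    """AUROC as the probability that a random positive outscores a
    random negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = disease present, 0 = absent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print("sensitivity:", sensitivity(tp, fn))
print("specificity:", specificity(tn, fp))
print("phi:", phi_coefficient(tp, tn, fp, fn))
print("AUROC:", auroc(y_true, scores))
```

Sensitivity, specificity and phi depend on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds. This is why a model can have a reasonable AUROC but modest sensitivity at a given operating point.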
Pages: 18