An Interpretable Machine Learning Framework for Rare Disease: A Case Study to Stratify Infection Risk in Pediatric Leukemia

Cited by: 6
Authors
Al-Hussaini, Irfan [1, 2, 3]
White, Brandon [1, 2, 4]
Varmeziar, Armon [1, 2, 4]
Mehra, Nidhi [1, 2, 4]
Sanchez, Milagro [1, 2, 4]
Lee, Judy [5]
Degroote, Nicholas P. [5]
Miller, Tamara P. [5, 6]
Mitchell, Cassie S. [1, 2, 4, 7]
Affiliations
[1] Georgia Inst Technol, Lab Pathol Dynam, Atlanta, GA 30332 USA
[2] Emory Univ, Atlanta, GA 30332 USA
[3] Georgia Inst Technol, Dept Elect & Comp Engn, Atlanta, GA 30332 USA
[4] Georgia Inst Technol, Dept Biomed Engn, Atlanta, GA 30332 USA
[5] Childrens Healthcare Atlanta, Aflac Canc & Blood Disorders Ctr, Atlanta, GA 30322 USA
[6] Emory Univ, Dept Pediat, Div Pediat Hematol Oncol, Atlanta, GA USA
[7] Georgia Inst Technol, Machine Learning Ctr, Georgia Tech, Atlanta, GA 30332 USA
Keywords
pediatric leukemia; infection; artificial intelligence; machine learning; natural language processing; ACUTE LYMPHOBLASTIC-LEUKEMIA; CHILDREN; IMPACT; HYPERGLYCEMIA; ANTIBODIES; THERAPY
DOI
10.3390/jcm13061788
Chinese Library Classification number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Background: Datasets on rare diseases, such as pediatric acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL), have small sample sizes that hinder machine learning (ML). The objective was to develop an interpretable ML framework to elucidate actionable insights from small tabular rare disease datasets. Methods: The comprehensive framework employed optimized data imputation and sampling, supervised and unsupervised learning, and literature-based discovery (LBD). The framework was deployed to assess treatment-related infection in pediatric AML and ALL. Results: An interpretable decision tree classified the risk of infection as either "high risk" or "low risk" in pediatric ALL (n = 580) and AML (n = 132) with an accuracy of approximately 79%. Interpretable regression models predicted the discrete number of developed infections with a mean absolute error (MAE) of 2.26 for bacterial infections and an MAE of 1.29 for viral infections. Features that best explained the development of infection were the chemotherapy regimen, cancer cells in the central nervous system at initial diagnosis, chemotherapy course, leukemia type, Down syndrome, race, and National Cancer Institute risk classification. Finally, SemNet 2.0, an open-source LBD software that links relationships from 33+ million PubMed articles, identified additional features for the prediction of infection, such as glucose, iron, neutropenia-reducing growth factors, and systemic lupus erythematosus (SLE). Conclusions: The developed ML framework enabled state-of-the-art, interpretable predictions using rare disease tabular datasets. ML model performance baselines were successfully produced to predict infection in pediatric AML and ALL.
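For readers unfamiliar with the MAE figures quoted in the abstract, the following is a minimal sketch of how a mean absolute error is computed for count predictions. The function name and the patient counts are hypothetical illustrations, not data or code from the study.

```python
def mean_absolute_error(actual, predicted):
    """Return the average of |actual - predicted| over paired observations."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical infection counts for five patients (NOT data from the study).
observed_counts = [0, 2, 5, 1, 3]
predicted_counts = [1, 2, 3, 0, 4]

# Absolute errors are 1, 0, 2, 1, 1, so the mean is 5 / 5 = 1.0.
mae = mean_absolute_error(observed_counts, predicted_counts)
print(mae)  # prints 1.0
```

Read this way, an MAE of 2.26 would mean that predicted bacterial infection counts differ from the observed counts by about 2.26 infections on average.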
Pages: 24