Interpretable machine learning prediction of all-cause mortality

Cited by: 52
Authors
Qiu, Wei [1]
Chen, Hugh [1]
Dincer, Ayse Berceste [1]
Lundberg, Scott [2]
Kaeberlein, Matt [3]
Lee, Su-In [1]
Affiliations
[1] Univ Washington, Paul G Allen Sch Comp Sci & Engn, Seattle, WA 98195 USA
[2] Microsoft Res, Redmond, WA USA
[3] Univ Washington, Dept Lab Med & Pathol, Seattle, WA USA
Source
COMMUNICATIONS MEDICINE | 2022 / Vol. 2 / Issue 01
Funding
US National Institutes of Health; US National Science Foundation;
Keywords
CELL DISTRIBUTION WIDTH; BODY-MASS INDEX; SERUM POTASSIUM LEVELS; 2ND NATIONAL-HEALTH; BLOOD LEAD LEVELS; FAT-FREE MASS; FOLLOW-UP; CARDIOVASCULAR-DISEASE; CALF CIRCUMFERENCE; RISK;
DOI
10.1038/s43856-022-00180-x
Chinese Library Classification number
R-3 [Medical research methods]; R3 [Basic medicine];
Discipline classification code
1001;
Abstract
Background: Unlike linear models which are traditionally used to study all-cause mortality, complex machine learning models can capture non-linear interrelations and provide opportunities to identify unexplored risk factors. Explainable artificial intelligence can improve prediction accuracy over linear models and reveal great insights into outcomes like mortality. This paper comprehensively analyzes all-cause mortality by explaining complex machine learning models.

Methods: We propose the IMPACT framework that uses XAI technique to explain a state-of-the-art tree ensemble mortality prediction model. We apply IMPACT to understand all-cause mortality for 1-, 3-, 5-, and 10-year follow-up times within the NHANES dataset, which contains 47,261 samples and 151 features.

Results: We show that IMPACT models achieve higher accuracy than linear models and neural networks. Using IMPACT, we identify several overlooked risk factors and interaction effects. Furthermore, we identify relationships between laboratory features and mortality that may suggest adjusting established reference intervals. Finally, we develop highly accurate, efficient and interpretable mortality risk scores that can be used by medical professionals and individuals without medical expertise. We ensure generalizability by performing temporal validation of the mortality risk scores and external validation of important findings with the UK Biobank dataset.

Conclusions: IMPACT's unique strength is the explainable prediction, which provides insights into the complex, non-linear relationships between mortality and features, while maintaining high accuracy. Our explainable risk scores could help individuals improve self-awareness of their health status and help clinicians identify patients with high risk. IMPACT takes a consequential step towards bringing contemporary developments in XAI to epidemiology.
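
Illustrative note: the abstract says that IMPACT explains a tree ensemble with an XAI technique but does not name the technique. As a minimal sketch of what such an explanation can look like, the following computes exact Shapley-value attributions for a toy, hand-written risk function. The function `toy_risk`, its thresholds, the feature names, and the reference profile are all hypothetical and are not taken from the paper or from NHANES; this is not the authors' implementation.

```python
from itertools import combinations
from math import factorial


def toy_risk(features):
    """Hypothetical non-linear risk score (NOT the IMPACT model)."""
    risk = 0.01 * features["age"]
    if features["albumin"] < 3.5:      # assumed cut-off, for illustration only
        risk += 0.2
        if features["age"] > 70:       # interaction: low albumin AND older age
            risk += 0.1
    return risk


def shapley_values(model, x, baseline):
    """Exact Shapley attributions of model(x) relative to model(baseline).

    Features outside a coalition are set to their baseline value.
    Cost is exponential in the number of features, so this only suits toys.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for name in names:
        others = [m for m in names if m != name]
        total = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = {m: x[m] if (m in subset or m == name) else baseline[m]
                          for m in names}
                without_f = {m: x[m] if m in subset else baseline[m]
                             for m in names}
                total += weight * (model(with_f) - model(without_f))
        phi[name] = total
    return phi


patient = {"age": 80, "albumin": 3.0}      # hypothetical individual
reference = {"age": 50, "albumin": 4.2}    # hypothetical reference profile
phi = shapley_values(toy_risk, patient, reference)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the individual's score and the reference score, and the interaction term is split between the two features involved. This is the kind of per-feature, per-individual view that an explainable risk score relies on.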
Pages: 15