Developing an Explainable Machine Learning-Based Personalised Dementia Risk Prediction Model: A Transfer Learning Approach With Ensemble Learning Algorithms

Cited by: 43
Authors
Danso, Samuel O. [1]
Zeng, Zhanhang [2]
Muniz-Terrera, Graciela [1]
Ritchie, Craig W. [1]
Affiliations
[1] Univ Edinburgh, Edinburgh Dementia Prevent, Ctr Clin Brain Sci, Sch Med, Edinburgh, Midlothian, Scotland
[2] Univ Edinburgh, Sch Informat, Edinburgh, Midlothian, Scotland
Source
FRONTIERS IN BIG DATA | 2021 / Volume 4
Keywords
early detection; risk factors; Alzheimer's; personalised dementia risk; explainable AI model; ensemble-based learning; MILD COGNITIVE IMPAIRMENT; PREVENTION; ALZHEIMERS; INTERVENTION
DOI
10.3389/fdata.2021.613047
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Alzheimer's disease (AD) has its onset many decades before dementia develops, and work is ongoing to characterise individuals at risk of decline on the basis of early detection through biomarker and cognitive testing as well as the presence/absence of identified risk factors. Risk prediction models for AD based on various computational approaches, including machine learning, are being developed with promising results. However, these approaches have been criticised because they fail to generalise, owing to over-reliance on one data source, poor internal and external validation, and a lack of understanding of the prediction models, all of which limit their clinical utility. We propose a framework that employs a transfer-learning paradigm with ensemble learning algorithms to develop explainable personalised risk prediction models for dementia. Our prediction models, known as source models, are initially trained and tested using a publicly available dataset (n = 84,856, mean age = 69 years) with 14 years of follow-up samples to predict the individual risk of developing dementia. The decision boundaries of the best source model are further updated by using an alternative dataset from a different and much younger population (n = 473, mean age = 52 years) to obtain an additional prediction model known as the target model. We further apply the SHapley Additive exPlanations (SHAP) algorithm to visualise the risk factors responsible for the prediction at both population and individual levels. The best source model achieves a geometric accuracy of 87%, specificity of 99%, and sensitivity of 76%. In comparison to a baseline model, our target model achieves better performance across several performance metrics, with an increase in geometric accuracy of 16.9%, specificity of 2.7%, sensitivity of 19.1%, and area under the receiver operating characteristic curve (AUROC) of 11%, and a transfer learning efficacy rate of 20.6%.
The strengths of our approach are the large sample size used to train the source model, the transfer and application of the learned "knowledge" to another dataset from a different, undiagnosed population for the early detection and prediction of dementia risk, and the ability to visualise the interaction of the risk factors that drive the prediction. This approach has direct clinical utility.
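The abstract does not define "geometric accuracy". A minimal sketch follows, assuming it denotes the geometric mean of sensitivity and specificity, a common summary metric for imbalanced classification; under that assumption the reported source-model figures (sensitivity 76%, specificity 99%) are consistent with the reported 87%. The function names and the confusion-matrix counts are hypothetical and used only for illustration.

```python
import math


def rates_from_counts(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


def geometric_accuracy(sensitivity: float, specificity: float) -> float:
    """Geometric mean of sensitivity and specificity.

    Assumption: this is what the abstract calls "geometric accuracy";
    the record itself does not give the formula.
    """
    return math.sqrt(sensitivity * specificity)


# Figures reported for the best source model in the abstract.
g = geometric_accuracy(sensitivity=0.76, specificity=0.99)
print(f"geometric accuracy: {g:.2f}")  # prints "geometric accuracy: 0.87"

# Hypothetical counts (not from the paper) that yield the same rates.
sens, spec = rates_from_counts(tp=76, fn=24, tn=99, fp=1)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Unlike plain accuracy, this metric drops sharply when either class is poorly recognised, which is presumably why it is reported for a dataset where dementia cases are a minority.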
Pages: 14