Using Hypothesis-Led Machine Learning and Hierarchical Cluster Analysis to Identify Disease Pathways Prior to Dementia: Longitudinal Cohort Study

Cited by: 7
Authors
Huang, Shih-Tsung [1, 2]
Hsiao, Fei-Yuan [3, 4, 5]
Tsai, Tsung-Hsien [6]
Chen, Pei-Jung [6]
Peng, Li-Ning [2, 7]
Chen, Liang-Kung [2, 7, 8]
Affiliations
[1] Natl Yang Ming Chiao Tung Univ, Dept Pharm, Taipei, Taiwan
[2] Natl Yang Ming Chiao Tung Univ, Ctr Hlth Longev & Aging Sci, Taipei, Taiwan
[3] Natl Taiwan Univ, Grad Inst Clin Pharm, Coll Med, Taipei, Taiwan
[4] Natl Taiwan Univ, Coll Med, Sch Pharm, Taipei, Taiwan
[5] Natl Taiwan Univ Hosp, Dept Pharm, Taipei, Taiwan
[6] Acer, Adv Tech Business Unit, New Taipei, Taiwan
[7] Taipei Vet Gen Hosp, Ctr Geriatr & Gerontol, Taipei, Taiwan
[8] Taipei Vet Gen Hosp, Taipei Municipal Gan Dau Hosp, Taipei, Taiwan
Keywords
dementia; machine learning; cluster analysis; disease; condition; symptoms; data; data set; cardiovascular; neuropsychiatric; infection; mobility; mental conditions; development; COGNITIVE IMPAIRMENT; CAROTID STENOSIS; RISK; ASSOCIATIONS; POPULATION; ADULTS;
DOI
10.2196/41858
Chinese Library Classification number
R19 [Health care organizations and services (health services administration)];
Abstract
Background: Dementia development is a complex process in which the occurrence and sequential relationships of different diseases or conditions may construct specific patterns leading to incident dementia.
Objective: This study aimed to identify patterns of disease or symptom clusters and their sequences prior to incident dementia using a novel approach incorporating machine learning methods.
Methods: Using Taiwan's National Health Insurance Research Database, data from 15,700 older people with dementia and 15,700 nondementia controls matched on age, sex, and index year (n=10,466, 67% for the training data set and n=5234, 33% for the testing data set) were retrieved for analysis. Using machine learning methods to capture specific hierarchical disease triplet clusters prior to dementia, we designed a study algorithm with four steps: (1) data preprocessing, (2) disease or symptom pathway selection, (3) model construction and optimization, and (4) data visualization.
Results: Among 15,700 identified older people with dementia, 10,466 and 5234 subjects were randomly assigned to the training and testing data sets, and 6215 hierarchical disease triplet clusters with positive correlations with dementia onset were identified. We subsequently generated 19,438 features to construct prediction models, and the model with the best performance was support vector machine (SVM) with the by-group LASSO (least absolute shrinkage and selection operator) regression method (total corresponding features=2513; accuracy=0.615; sensitivity=0.607; specificity=0.622; positive predictive value=0.612; negative predictive value=0.619; area under the curve=0.639). In total, this study captured 49 hierarchical disease triplet clusters related to dementia development, and the most characteristic patterns leading to incident dementia started with cardiovascular conditions (mainly hypertension), cerebrovascular disease, mobility disorders, or infections, followed by neuropsychiatric conditions.
Conclusions: Dementia development in the real world is an intricate process involving various diseases or conditions, their co-occurrence, and sequential relationships. Using a machine learning approach, we identified 49 hierarchical disease triplet clusters with leading roles (cardio- or cerebrovascular disease) and supporting roles (mental conditions, locomotion difficulties, infections, and nonspecific neurological conditions) in dementia development. Further studies using data from other countries are needed to validate the prediction algorithms for dementia development, allowing the development of comprehensive strategies to prevent or care for dementia in the real world.
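The performance measures quoted in the Results (accuracy, sensitivity, specificity, positive and negative predictive value) all derive from a 2x2 confusion matrix of predicted versus actual dementia status. The following is a minimal sketch of those standard definitions, not the authors' code. The names `ConfusionCounts` and `classification_metrics` and the example counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a 2x2 confusion matrix for a binary dementia classifier."""
    tp: int  # people with dementia predicted as dementia
    fp: int  # controls predicted as dementia
    tn: int  # controls predicted as nondementia
    fn: int  # people with dementia predicted as nondementia


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return the standard measures; assumes every denominator is nonzero."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),  # true positive rate
        "specificity": c.tn / (c.tn + c.fp),  # true negative rate
        "ppv": c.tp / (c.tp + c.fp),          # positive predictive value
        "npv": c.tn / (c.tn + c.fn),          # negative predictive value
    }


# Hypothetical counts for illustration only (NOT the study's data):
# 100 cases and 100 matched controls.
example = ConfusionCounts(tp=61, fp=38, tn=62, fn=39)
metrics = classification_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With equal numbers of cases and controls, as in a 1:1 matched design, accuracy is the simple average of sensitivity and specificity. The area under the curve is not covered here because it requires the model's continuous scores rather than the four counts.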
Pages: 16