Multivariate longitudinal data for survival analysis of cardiovascular event prediction in young adults: insights from a comparative explainable study

Cited by: 5
Authors
Nguyen, Hieu T. [1]
Vasconcellos, Henrique D. [2]
Keck, Kimberley [2]
Reis, Jared P. [3]
Lewis, Cora E. [4]
Sidney, Steven [5]
Lloyd-Jones, Donald M. [6]
Schreiner, Pamela J. [7]
Guallar, Eliseo [8]
Wu, Colin O. [3]
Lima, Joao A. C. [2, 9]
Ambale-Venkatesh, Bharath [9]
Affiliations
[1] Johns Hopkins Univ, Dept Biomed Engn, Baltimore, MD 21218 USA
[2] Johns Hopkins Univ, Dept Cardiol, Baltimore, MD USA
[3] NHLBI, Bldg 10, Bethesda, MD 20892 USA
[4] Univ Alabama Birmingham, Sch Publ Hlth, Dept Epidemiol, Birmingham, AL 35294 USA
[5] Kaiser Permanente, Div Res, Oakland, CA USA
[6] Northwestern Univ, Dept Prevent Med, Chicago, IL 60611 USA
[7] Univ Minnesota, Sch Publ Hlth, Minneapolis, MN USA
[8] Johns Hopkins Univ, Sch Publ Hlth, Dept Epidemiol, Baltimore, MD 21205 USA
[9] Johns Hopkins Univ, Dept Radiol, Baltimore, MD 21218 USA
Keywords
Longitudinal data; Explainable AI; Survival analysis; Risk prediction; Repeated measures; Personalized medicine; Time-varying covariates; SHAP; TIME; CARDIA; TIME-TO-EVENT; ELECTRONIC HEALTH RECORDS; RISK PREDICTION; BLOOD-PRESSURE; HEART-FAILURE; JOINT MODEL; MIDDLE-AGE; DISEASE; TRAJECTORIES; PACKAGE;
DOI
10.1186/s12874-023-01845-4
Chinese Library Classification (CLC) number
R19 [Health care organization and services (health services administration)];
Abstract
Background: Multivariate longitudinal data are under-utilized for survival analysis compared to cross-sectional data (CS; data collected once across a cohort). In cardiovascular risk prediction in particular, despite the availability of methods for longitudinal data analysis, the value of longitudinal information has not been established in terms of improved predictive accuracy and clinical applicability.

Methods: We investigated the value of longitudinal data over and above the use of cross-sectional data via 6 distinct modeling strategies from statistics, machine learning, and deep learning that incorporate repeated measures for survival analysis of the time to cardiovascular event in the Coronary Artery Risk Development in Young Adults (CARDIA) cohort. We then examined and compared the use of model-specific interpretability methods (Random Survival Forest Variable Importance) and model-agnostic methods (SHapley Additive exPlanation (SHAP) and Temporal Importance Model Explanation (TIME)) in cardiovascular risk prediction using the top-performing models.

Results: In a cohort of 3539 participants, longitudinal information from 35 variables that were repeatedly collected in 6 exam visits over 15 years improved subsequent long-term (17 years after) risk prediction by up to 8.3% in C-index compared to using baseline data (0.78 vs. 0.72), and by up to approximately 4% compared to using the last observed CS data (0.75). Time-varying AUC was also higher in models using longitudinal data (0.86-0.87 at 5 years, 0.79-0.81 at 10 years) than in models using baseline or last observed CS data (0.80-0.86 at 5 years, 0.73-0.77 at 10 years). Comparative model interpretability analysis revealed the impact of longitudinal variables on model prediction at both the individual and global scales across the different modeling strategies, and identified the best time windows and the best timing within each window for event prediction. The best strategy for incorporating longitudinal data in terms of accuracy was time series massive feature extraction, and the most easily interpretable strategy was trajectory clustering.

Conclusion: Our analysis demonstrates the added value of longitudinal data in predictive accuracy and epidemiological utility in cardiovascular risk survival analysis in young adults via a unified, scalable framework that compares model performance and explainability. The framework can be extended to a larger number of variables and to other longitudinal modeling methods.
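
The percentage gains quoted in the Results are consistent with relative changes in C-index: (0.78 - 0.72) / 0.72 is about 8.3%, and (0.78 - 0.75) / 0.75 is 4%. The sketch below reproduces that arithmetic and shows a minimal Harrell-style concordance index on toy data. It is an illustration only, not the authors' implementation; the function names `concordance_index` and `relative_improvement` and the toy inputs are assumptions introduced here.

```python
def concordance_index(times, events, risks):
    """Harrell-style C-index: share of comparable pairs ranked correctly by risk.

    times  -- follow-up time per subject
    events -- 1 if the event was observed, 0 if censored
    risks  -- predicted risk score (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i has an observed event
            # strictly before subject j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def relative_improvement(new, old):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


if __name__ == "__main__":
    # Hypothetical toy data, not taken from CARDIA.
    times = [2, 4, 6, 8]
    events = [1, 1, 0, 1]
    risks = [0.9, 0.4, 0.5, 0.1]
    print(concordance_index(times, events, risks))

    # C-index values as reported in the abstract.
    print(round(relative_improvement(0.78, 0.72), 1))  # vs. baseline data
    print(round(relative_improvement(0.78, 0.75), 1))  # vs. last observed CS data
```

This version ignores tied event times and uses an O(n^2) pair loop, which is adequate for illustration but not for a cohort-scale evaluation.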
Pages: 19