Multi-modality risk prediction of cardiovascular diseases for breast cancer cohort in the All of Us Research Program

Cited by: 1
Authors
Yang, Han [1]
Zhou, Sicheng [1]
Rao, Zexi [2]
Zhao, Chen [2]
Cui, Erjia [2]
Shenoy, Chetan [3]
Blaes, Anne H. [4]
Paidimukkala, Nishitha [1]
Wang, Jinhua [5]
Hou, Jue [2]
Zhang, Rui [6]
Affiliations
[1] Univ Minnesota, Inst Hlth Informat, Minneapolis, MN 55455 USA
[2] Univ Minnesota, Sch Publ Hlth, Div Biostat & Hlth Data Sci, 2221 Univ Ave SE,Suite 200, Minneapolis, MN 55414 USA
[3] Univ Minnesota, Med Ctr, Dept Med, Cardiovasc Div, Minneapolis, MN 55455 USA
[4] Univ Minnesota, Div Hematol Oncol & Transplantat, Minneapolis, MN 55455 USA
[5] Univ Minnesota, Masonic Canc Ctr, Minneapolis, MN 55455 USA
[6] Univ Minnesota, Dept Surg, Div Comp Hlth Sci, 308 Harvard St SE, Minneapolis, MN 55455 USA
Funding
US National Institutes of Health (NIH)
Keywords
cardiovascular disease; breast cancer; predictive model; All of Us; social determinants; survival; models; time; associations; statement; selection; impact; index; lasso
DOI
10.1093/jamia/ocae199
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Objective: This study leverages the rich diversity of the All of Us Research Program (All of Us) dataset to devise a predictive model for cardiovascular disease (CVD) in breast cancer (BC) survivors. Central to this endeavor is the creation of a robust data integration pipeline that synthesizes electronic health records (EHRs), patient surveys, and genomic data, while upholding fairness across demographic variables.

Materials and Methods: We developed a universal data wrangling pipeline to process and merge the heterogeneous data sources of the All of Us dataset, address missingness and variance in the data, and align disparate data modalities into a coherent framework for analysis. Using a composite feature set that includes EHR, lifestyle, and social determinants of health (SDoH) data, we then employed Adaptive Lasso and Random Forest regression models to predict 6 CVD outcomes. The models were evaluated using the c-index and the time-dependent area under the receiver operating characteristic curve (AUC) over a 10-year period.

Results: The Adaptive Lasso model showed consistent performance across most CVD outcomes, while the Random Forest model excelled particularly in predicting outcomes such as transient ischemic attack when incorporating the full multi-modal feature set. Feature importance analysis revealed age and previous coronary events as dominant predictors across CVD outcomes, with SDoH clustering labels highlighting the nuanced impact of social factors.

Discussion: The development of both a Cox-based predictive model and a Random Forest regression model demonstrates an extensive application of All of Us data, integrating EHRs and patient surveys to enhance precision medicine. The inclusion of SDoH clustering labels revealed the significant impact of sociobehavioral factors on patient outcomes, emphasizing the importance of comprehensive health determinants in predictive models. Despite these advancements, limitations include the exclusion of genetic data, the broad categorization of CVD conditions, and the need for fairness analyses to ensure equitable model performance across diverse populations. Future work should refine clinical and social variable measurements, incorporate advanced imputation techniques, and explore additional predictive algorithms to enhance model precision and fairness.

Conclusion: This study demonstrates the viability of the diverse All of Us dataset for developing a multi-modality predictive model for CVD risk stratification in BC survivors, in support of oncological survivorship care. The data integration pipeline and subsequent predictive models establish a methodological foundation for future research into personalized healthcare.
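
Illustrative note on the evaluation metric: the abstract reports the c-index but does not say how it was computed. The sketch below shows one common definition, Harrell's concordance index for right-censored data, in plain Python. The function name, argument names, and the handling of tied event times (such pairs are skipped) are assumptions made for illustration, not the authors' implementation.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's c-index for right-censored survival data (illustrative sketch).

    times       -- observed follow-up time per subject
    events      -- 1/True if the event was observed, 0/False if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Assumption: pairs with tied times are skipped. A pair is comparable
        # only if the subject with the shorter time actually had the event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of subjects by risk. In practice a maintained survival-analysis library would normally be used in place of this quadratic-time loop.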
Pages: 2800-2810
Number of pages: 11