Predicting hypertension onset from longitudinal electronic health records with deep learning

Cited by: 11
Authors
Datta, Suparno [1, 2]
Morassi Sasso, Ariane [1, 2]
Kiwit, Nina [1]
Bose, Subhronil [1]
Nadkarni, Girish [1, 2, 3]
Miotto, Riccardo [2, 4]
Boettinger, Erwin P. [1, 2, 3, 5]
Affiliations
[1] Univ Potsdam, Hasso Plattner Inst, Digital Hlth Ctr, Potsdam, Germany
[2] Icahn Sch Med Mt Sinai, Hasso Plattner Inst Digital Hlth Mt Sinai, New York, NY 10029 USA
[3] Icahn Sch Med Mt Sinai, Dept Med, New York, NY 10029 USA
[4] Icahn Sch Med Mt Sinai, Dept Genet & Genom Sci, New York, NY 10029 USA
[5] Icahn Sch Med Mt Sinai, Windreich Dept Artificial Intelligence & Human Hl, New York, NY 10029 USA
Funding
US National Institutes of Health
Keywords
machine learning; electronic health records; deep learning; hypertension; HIGH BLOOD-PRESSURE; INCIDENT HYPERTENSION; AMERICAN-COLLEGE; RISK; PREVENTION; MANAGEMENT; ADULTS;
DOI
10.1093/jamiaopen/ooac097
Chinese Library Classification
R19 [Health care organization and services (health services administration)]
Abstract
Objective: Hypertension has long been recognized as one of the most important predisposing factors for cardiovascular diseases and mortality. In recent years, machine learning methods have shown potential in diagnostic and predictive approaches in chronic diseases. Electronic health records (EHRs) have emerged as a reliable source of longitudinal data. The aim of this study is to predict the onset of hypertension using modern deep learning (DL) architectures, specifically long short-term memory (LSTM) networks, and longitudinal EHRs.
Materials and Methods: We compare this approach to the best performing models reported from previous works, particularly XGBoost, applied to aggregated features. Our work is based on data from 233 895 adult patients from a large health system in the United States. We divided our population into 2 distinct longitudinal datasets based on the diagnosis date. To ensure generalization to unseen data, we trained our models on the first dataset (dataset A "train and validation") using cross-validation, and then applied the models to a second dataset (dataset B "test") to assess their performance. We also experimented with 2 different time-windows before the onset of hypertension and evaluated the impact on model performance.
Results: With the LSTM network, we were able to achieve an area under the receiver operating characteristic curve value of 0.98 in the "train and validation" dataset A and 0.94 in the "test" dataset B for a prediction time window of 1 year. Lipid disorders, type 2 diabetes, and renal disorders are found to be associated with incident hypertension.
Conclusion: These findings show that DL models based on temporal EHR data can improve the identification of patients at high risk of hypertension and corresponding driving factors. In the long term, this work may support identifying individuals who are at high risk for developing hypertension and facilitate earlier intervention to prevent the future development of hypertension.
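The evaluation design described in the abstract (a split into two datasets by diagnosis date, then an AUROC on the held-out dataset) can be illustrated with a minimal sketch. This is not the authors' pipeline: the record fields (`ref_date`, `label`, `score`), the cutoff date, and the toy scores are all hypothetical, and the scores stand in for the output of any model rather than an actual LSTM or XGBoost model.

```python
from datetime import date

def split_by_date(records, cutoff):
    """Split records into dataset A (before cutoff) and dataset B (on or after)."""
    dataset_a = [r for r in records if r["ref_date"] < cutoff]
    dataset_b = [r for r in records if r["ref_date"] >= cutoff]
    return dataset_a, dataset_b

def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy records; label 1 = incident hypertension, score = model output.
records = [
    {"ref_date": date(2015, 3, 1), "label": 1, "score": 0.9},
    {"ref_date": date(2016, 7, 1), "label": 0, "score": 0.2},
    {"ref_date": date(2017, 1, 15), "label": 1, "score": 0.6},
    {"ref_date": date(2018, 5, 1), "label": 0, "score": 0.4},
    {"ref_date": date(2019, 2, 1), "label": 1, "score": 0.8},
    {"ref_date": date(2019, 9, 1), "label": 0, "score": 0.8},
    {"ref_date": date(2020, 4, 1), "label": 1, "score": 0.7},
    {"ref_date": date(2020, 11, 1), "label": 0, "score": 0.1},
]

# Assumed cutoff; the abstract does not state the actual date used.
dataset_a, dataset_b = split_by_date(records, date(2018, 1, 1))

# Evaluate only on the later, held-out dataset B.
test_auroc = auroc([r["label"] for r in dataset_b],
                   [r["score"] for r in dataset_b])
print(f"A: {len(dataset_a)} records, B: {len(dataset_b)} records, "
      f"AUROC on B: {test_auroc:.2f}")
```

Splitting on a date rather than at random keeps later patients out of training, which is one way to approximate performance on unseen data as the abstract describes.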
Pages: 10
Related papers
50 in total
  • [31] Treatment effect prediction with adversarial deep learning using electronic health records
    Chu, Jiebin
    Dong, Wei
    Wang, Jinliang
    He, Kunlun
    Huang, Zhengxing
    [J]. BMC MEDICAL INFORMATICS AND DECISION MAKING, 2020, 20 (Suppl 4)
  • [33] Deep Learning Prediction of Mild Cognitive Impairment using Electronic Health Records
    Fouladvand, Sajjad
    Mielke, Michelle M.
    Vassilaki, Maria
    St Sauver, Jennifer
    Petersen, Ronald C.
    Sohn, Sunghwan
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2019: 799 - 806
  • [34] Quantification of abdominal fat from computed tomography using deep learning and its association with electronic health records in an academic biobank
    MacLean, Matthew T.
    Jehangir, Qasim
    Vujkovic, Marijana
    Ko, Yi-An
    Litt, Harold
    Borthakur, Arijitt
    Sagreiya, Hersh
    Rosen, Mark
    Mankoff, David A.
    Schnall, Mitchell D.
    Shou, Haochang
    Chirinos, Julio
    Damrauer, Scott M.
    Torigian, Drew A.
    Carr, Rotonya
    Rader, Daniel J.
    Witschey, Walter R.
    [J]. JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2021, 28 (06) : 1178 - 1187
  • [35] A deep learning method to detect opioid prescription and opioid use disorder from electronic health records
    Kashyap, Aditya
    Callison-Burch, Chris
    Boland, Mary Regina
    [J]. INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2023, 171
  • [36] Learning from heterogeneous temporal data in electronic health records
    Zhao, Jing
    Papapetrou, Panagiotis
    Asker, Lars
    Bostrom, Henrik
    [J]. JOURNAL OF BIOMEDICAL INFORMATICS, 2017, 65 : 105 - 119
  • [37] Predicting sequenced dental treatment plans from electronic dental records using deep learning
    Chen, Haifan
    Liu, Pufan
    Chen, Zhaoxing
    Chen, Qingxiao
    Wen, Zaiwen
    Xie, Ziqing
    [J]. ARTIFICIAL INTELLIGENCE IN MEDICINE, 2024, 147
  • [38] Predicting early-onset COPD risk in adults aged 20-50 using electronic health records and machine learning
    Liu, Guanglei
    Hu, Jiani
    Yang, Jianzhe
    Song, Jie
    [J]. PEERJ, 2024, 12
  • [39] Domain Knowledge Guided Deep Learning with Electronic Health Records
    Yin, Changchang
    Zhao, Rongjian
    Qian, Buyue
    Lv, Xin
    Zhang, Ping
    [J]. 2019 19TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2019), 2019: 738 - 747
  • [40] Readmission prediction using deep learning on electronic health records
    Ashfaq, Awais
    Sant'Anna, Anita
    Lingman, Markus
    Nowaczyk, Slawomir
    [J]. JOURNAL OF BIOMEDICAL INFORMATICS, 2019, 97