Unsupervised Learning to Subphenotype Heart Failure Patients from Electronic Health Records

Cited by: 1
Authors
Hackl, Melanie [1]
Datta, Suparno [1, 2]
Miotto, Riccardo [2]
Bottinger, Erwin [1, 2]
Affiliations
[1] Univ Potsdam, Hasso Plattner Inst, Digital Hlth Ctr, Potsdam, Germany
[2] Icahn Sch Med Mt Sinai, Hasso Plattner Inst Digital Hlth Mt Sinai, New York, NY USA
Source
ARTIFICIAL INTELLIGENCE IN MEDICINE (AIME 2021) | 2021
Keywords
Unsupervised learning; Electronic health records; Heart failure
DOI
10.1007/978-3-030-77211-6_24
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Heart failure (HF) is a deadly disease, and its prevalence is slowly increasing. HF subtypes are currently determined mostly by the ejection fraction (EF). In this work, we try to find novel subgroups of heart failure following a completely data-driven approach, clustering patients based on their electronic health records (EHRs). Using a validated phenotyping algorithm, we identified 14,334 adult patients with heart failure in our database. We derived patient embeddings using two different strategies: one processes aggregated clinical features using principal component analysis (PCA) and uniform manifold approximation and projection (UMAP), and the other learns embeddings from the sequence of medical events using a long short-term memory (LSTM) autoencoder. We then evaluated different clustering strategies, such as k-means and agglomerative hierarchical clustering, to derive the most informative subtypes. The results were compared using metrics such as the silhouette coefficient and by comparing outcomes such as hospitalization and EF between the clusters. In the most promising result, we identified 3 subclusters using the aggregated-data approach with UMAP as the dimension reduction method and k-means as the clustering method. Patients in cluster 1 had the lowest number of hospital days and comorbidities, while patients in cluster 3 had a significantly higher number of hospital days together with a higher prevalence of comorbidities such as chronic kidney disease and atrial fibrillation. Patients in cluster 2 had a high prevalence of drug allergies in their medical history.
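The clustering and evaluation step described in the abstract (k-means on low-dimensional patient embeddings, scored with the silhouette coefficient) can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration and not the authors' implementation: the 2-D points are a synthetic stand-in for embeddings (the UMAP step is omitted), the function names `farthest_first`, `kmeans`, and `silhouette` are hypothetical, and the deterministic farthest-point seeding is a swapped-in choice, since the record does not say how k-means was initialized.

```python
from math import dist  # Euclidean distance, Python 3.8+


def farthest_first(points, k):
    """Deterministic seeding (an assumption): start at the first point, then
    repeatedly add the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations; returns one cluster label per point."""
    centers = farthest_first(points, k)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centers.append(
                tuple(sum(axis) / len(members) for axis in zip(*members))
                if members else centers[c]  # keep the center of an empty cluster
            )
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; points in singleton clusters score 0."""
    total = 0.0
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(dist(p, q))
        own = by_cluster.pop(labels[i], [])
        if not own:
            continue
        a = sum(own) / len(own)                             # mean intra-cluster distance
        b = min(sum(d) / len(d) for d in by_cluster.values())  # nearest other cluster
        total += (b - a) / max(a, b)
    return total / len(points)


# Synthetic stand-in for 2-D embeddings: three well-separated 3x3 grids of points.
blob_centers = [(0.0, 0.0), (10.0, 0.0), (4.0, 8.0)]
offsets = (-0.5, 0.0, 0.5)
points = [(cx + dx, cy + dy) for cx, cy in blob_centers for dx in offsets for dy in offsets]

# Score each candidate number of clusters and keep the best one.
scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
best_labels = kmeans(points, best_k)
print(best_k, round(scores[best_k], 3))
```

On this toy input, k = 3 should score highest because the points were planted as three well-separated groups. With real EHR embeddings the silhouette curve is typically much flatter, which is presumably why the abstract also compares clinical outcomes between clusters.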
Pages: 219-228
Number of pages: 10