Unsupervised Learning to Subphenotype Heart Failure Patients from Electronic Health Records

Cited by: 1
Authors
Hackl, Melanie [1]
Datta, Suparno [1, 2]
Miotto, Riccardo [2]
Bottinger, Erwin [1, 2]
Affiliations
[1] Univ Potsdam, Hasso Plattner Inst, Digital Hlth Ctr, Potsdam, Germany
[2] Icahn Sch Med Mt Sinai, Hasso Plattner Inst Digital Hlth Mt Sinai, New York, NY USA
Source
ARTIFICIAL INTELLIGENCE IN MEDICINE (AIME 2021) | 2021
Keywords
Unsupervised learning; Electronic health records; Heart failure
DOI
10.1007/978-3-030-77211-6_24
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Heart failure (HF) is a deadly disease and its prevalence is slowly increasing. HF subtypes are currently determined mostly by the ejection fraction (EF). In this work, we try to find novel subgroups of heart failure following a completely data-driven approach of clustering patients based on their electronic health records (EHRs). Using a validated phenotyping algorithm, we identified 14,334 adult patients with heart failure in our database. We derived patient embeddings using two different strategies: one processing aggregated clinical features using principal component analysis (PCA) and uniform manifold approximation and projection (UMAP), and one learning embeddings from the sequence of medical events using a long short-term memory (LSTM) autoencoder. We then evaluated different clustering strategies, such as k-means and agglomerative hierarchical clustering, to derive the most informative subtypes. The results were compared using metrics such as the silhouette coefficient and by comparing outcomes such as hospitalization and EF between the clusters. In the most promising result, we identified 3 subclusters using the aggregated data approach in combination with UMAP as the dimension reduction method and k-means as the clustering method. Patients in cluster 1 had the lowest number of hospital days and comorbidities, while patients in cluster 3 had a significantly higher number of hospital days together with a higher prevalence of comorbidities such as chronic kidney disease and atrial fibrillation. Patients in cluster 2 had a high prevalence of drug allergies in their medical history.
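The clustering-and-evaluation step described in the abstract (k-means on low-dimensional patient embeddings, with candidate cluster counts compared by silhouette coefficient) can be illustrated with a minimal sketch. This is not the authors' code. It assumes synthetic 2-D points standing in for the PCA/UMAP embeddings, a simple farthest-point initialization, and hypothetical names (`kmeans`, `silhouette`, `scores`, `best_k`); a real pipeline would more likely rely on established clustering and dimension-reduction libraries.

```python
import math
import random


def kmeans(points, k, n_iter=100):
    """Lloyd's k-means with farthest-point initialization (an assumption,
    chosen here only to keep the sketch deterministic)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        total += (b - a) / max(a, b)
    return total / len(points)


# Synthetic stand-in for patient embeddings: three well-separated 2-D blobs.
rng = random.Random(0)
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in [(0, 0), (10, 0), (0, 10)] for _ in range(30)]

# Compare candidate cluster counts by silhouette coefficient.
scores = {k: silhouette(points, kmeans(points, k)) for k in (2, 3, 4)}
best_k = max(scores, key=scores.get)
print(best_k, {k: round(s, 3) for k, s in scores.items()})
```

On real EHR-derived embeddings, the silhouette coefficient is only one signal; the abstract also describes comparing clinical outcomes such as hospitalization and EF between clusters.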
Pages: 219-228
Number of pages: 10
Related papers
50 in total
  • [31] Patient-oriented unsupervised learning to uncover the patterns of multimorbidity associated with stroke using primary care electronic health records
    Delord, Marc
    Sun, Xiaohui
    Learoyd, Annastazia
    Curcin, Vasa
    Wolfe, Charles
    Ashworth, Mark
    Douiri, Abdel
    BMC PRIMARY CARE, 2024, 25 (01):
  • [32] Using Unsupervised Machine Learning to Identify Subgroups Among Home Health Patients With Heart Failure Using Telehealth
    Bose, Eliezer
    Radhakrishnan, Kavita
    CIN-COMPUTERS INFORMATICS NURSING, 2018, 36 (05) : 242 - 248
  • [33] Mining disproportional itemsets for characterizing groups of heart failure patients from administrative health records
    Karlsson, Isak
    Papapetrou, Panagiotis
    Asker, Lars
    Bostrom, Henrik
    Persson, Hans E.
    10TH ACM INTERNATIONAL CONFERENCE ON PERVASIVE TECHNOLOGIES RELATED TO ASSISTIVE ENVIRONMENTS (PETRA 2017), 2017, : 394 - 398
  • [34] Unsupervised Annotation of Phenotypic Abnormalities via Semantic Latent Representations on Electronic Health Records
    Zhang, Jingqing
    Zhang, Xiaoyu
    Sun, Kai
    Yang, Xian
    Dai, Chengliang
    Guo, Yike
    2019 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2019, : 598 - 603
  • [35] Unsupervised machine learning for the discovery of latent disease clusters and patient subgroups using electronic health records
    Wang, Yanshan
    Zhao, Yiqing
    Therneau, Terry M.
    Atkinson, Elizabeth J.
    Tafti, Ahmad P.
    Zhang, Nan
    Amin, Shreyasee
    Limper, Andrew H.
    Khosla, Sundeep
    Liu, Hongfang
    JOURNAL OF BIOMEDICAL INFORMATICS, 2020, 102
  • [36] A Multi-Task Neural Network Architecture for Renal Dysfunction Prediction in Heart Failure Patients With Electronic Health Records
    Wang, Binhua
    Bai, Yongyi
    Yao, Zhenjie
    Li, Jiangong
    Dong, Wei
    Tu, Yanhui
    Xue, Wanguo
    Tian, Yaping
    Wang, Yifei
    He, Kunlun
    IEEE ACCESS, 2019, 7 : 178392 - 178400
  • [37] Deep Learning Analysis of Polish Electronic Health Records for Diagnosis Prediction in Patients with Cardiovascular Diseases
    Anetta, Kristof
    Horak, Ales
    Wojakowski, Wojciech
    Wita, Krystian
    Jadczyk, Tomasz
    JOURNAL OF PERSONALIZED MEDICINE, 2022, 12 (06):
  • [38] Phenotyping SubPopulations of Heart Failure Patients Based on Clinical and Social Determinants of Health Using Unsupervised Machine Learning Models
    Spinelli, Kateri J.
    Li, Hsin Fang
    Rider, Deanna
    Abraham, Jacob
    Spatz, Erica S.
    Huang, Xiaoyan
    CIRCULATION, 2023, 148
  • [39] Unsupervised Deep Learning of Electronic Health Records to Characterize Heterogeneity Across Alzheimer Disease and Related Dementias: Cross-Sectional Study
    West, Matthew
    Cheng, You
    He, Yingnan
    Leng, Yu
    Magdamo, Colin
    Hyman, Bradley
    Dickson, John R.
    Serrano-Pozo, Alberto
    Blacker, Deborah
    Das, Sudeshna
    JMIR AGING, 2025, 8
  • [40] Learning to Identify Severe Maternal Morbidity from Electronic Health Records
    Gao, Cheng
    Osmundson, Sarah
    Yan, Xiaowei
    Edwards, Digna Velez
    Malin, Bradley A.
    Chen, You
    MEDINFO 2019: HEALTH AND WELLBEING E-NETWORKS FOR ALL, 2019, 264 : 143 - 147