Unsupervised Learning to Subphenotype Heart Failure Patients from Electronic Health Records

Cited by: 1
Authors
Hackl, Melanie [1]
Datta, Suparno [1, 2]
Miotto, Riccardo [2]
Bottinger, Erwin [1, 2]
Affiliations
[1] Univ Potsdam, Hasso Plattner Inst, Digital Hlth Ctr, Potsdam, Germany
[2] Icahn Sch Med Mt Sinai, Hasso Plattner Inst Digital Hlth Mt Sinai, New York, NY USA
Source
ARTIFICIAL INTELLIGENCE IN MEDICINE (AIME 2021) | 2021
Keywords
Unsupervised learning; Electronic health records; Heart failure;
DOI
10.1007/978-3-030-77211-6_24
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Heart failure (HF) is a deadly disease whose prevalence is slowly increasing. HF subtypes are currently determined mostly by the ejection fraction (EF). In this work, we seek novel subgroups of heart failure through a fully data-driven approach that clusters patients based on their electronic health records (EHRs). Using a validated phenotyping algorithm, we identified 14,334 adult patients with heart failure in our database. We derived patient embeddings using two strategies: one processes aggregated clinical features with principal component analysis (PCA) and uniform manifold approximation and projection (UMAP); the other learns embeddings from the sequence of medical events with a long short-term memory (LSTM) autoencoder. We then evaluated clustering strategies such as k-means and agglomerative hierarchical clustering to derive the most informative subtypes. Results were compared using metrics such as the silhouette coefficient and by comparing outcomes such as hospitalization and EF between the clusters. In the most promising result, we identified three subclusters using the aggregated-data approach combined with UMAP for dimensionality reduction and k-means for clustering. Patients in cluster 1 had the lowest number of hospital days and comorbidities, while patients in cluster 3 had a significantly higher number of hospital days together with a higher prevalence of comorbidities such as chronic kidney disease and atrial fibrillation. Patients in cluster 2 had a high prevalence of drug allergies in their medical history.
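The clustering-and-scoring step described in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' code or data: the synthetic 2-D points stand in for low-dimensional patient embeddings (for example, the output of PCA or UMAP), the function names are hypothetical, and the farthest-point seeding is an illustrative choice made only to keep the example deterministic. In practice a library implementation of k-means and the silhouette coefficient would normally be used.

```python
import math
import random


def init_centers(points, k):
    """Deterministic farthest-point seeding (illustrative assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        # Pick the point whose nearest existing center is farthest away.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = init_centers(points, k)
    labels = []
    for _ in range(iters):
        # Assignment step: nearest center by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient over all points (higher is better)."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, p in enumerate(points):
        def mean_dist(c):
            others = [q for j, q in enumerate(points) if labels[j] == c and j != i]
            return sum(math.dist(p, q) for q in others) / len(others) if others else None

        a = mean_dist(labels[i])  # mean distance to the point's own cluster
        if a is None:  # singleton cluster: silhouette is defined as 0
            scores.append(0.0)
            continue
        b = min(mean_dist(c) for c in clusters if c != labels[i])  # nearest other cluster
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Synthetic stand-in for patient embeddings: three well-separated 2-D blobs.
rng = random.Random(42)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
points = [
    (cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
    for cx, cy in blob_centers
    for _ in range(20)
]

# Compare candidate numbers of clusters by mean silhouette coefficient.
scores = {k: silhouette(points, kmeans(points, k)) for k in (2, 3, 4)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

On real EHR-derived embeddings the silhouette coefficient alone rarely settles the choice, which is consistent with the abstract's additional comparison of clinical outcomes between clusters.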
Pages: 219-228
Number of pages: 10