Prediction of hospitalization due to heart diseases by supervised learning methods

Cited by: 78

Authors:

Dai, Wuyang [1,2]

Brisimi, Theodora S. [1,2]

Adams, William G. [3,4]

Mela, Theofanie [5]

Saligrama, Venkatesh [1,2]

Paschalidis, Ioannis Ch. [1,2]

Affiliations:

[1] Boston Univ, Dept Elect & Comp Engn, Boston, MA 02215 USA

[2] Boston Univ, Div Syst Engn, Boston, MA 02215 USA

[3] Boston Univ, Sch Med, Dept Pediat, Boston, MA 02118 USA

[4] Boston Med Ctr, Boston, MA 02118 USA

[5] Massachusetts Gen Hosp, Arrhythmia Serv, Electrophysiol Lab, Boston, MA 02114 USA

Source:

INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS | 2015, Vol. 84, No. 3

Funding:

U.S. National Science Foundation;

Keywords:

Prevention; Predictive models; Hospitalization; Heart diseases; Machine learning; Electronic Health Records (EHRs); ELECTRONIC MEDICAL-RECORDS; PRIMARY-CARE; RISK;

DOI:

10.1016/j.ijmedinf.2014.10.002

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Background: In 2008, the United States spent $2.2 trillion for healthcare, which was 15.5% of its GDP. 31% of this expenditure is attributed to hospital care. Evidently, even modest reductions in hospital care costs matter. A 2009 study showed that nearly $30.8 billion in hospital care cost during 2006 was potentially preventable, with heart diseases being responsible for about 31% of that amount.

Methods: Our goal is to accurately and efficiently predict heart-related hospitalizations based on the available patient-specific medical history. To the best of our knowledge, the approaches we introduce are novel for this problem. The prediction of hospitalization is formulated as a supervised classification problem. We use de-identified Electronic Health Record (EHR) data from a large urban hospital in Boston to identify patients with heart diseases. Patients are labeled and randomly partitioned into a training and a test set. We apply five machine learning algorithms, namely Support Vector Machines (SVM), AdaBoost using trees as the weak learner, logistic regression, a naive Bayes event classifier, and a variation of a Likelihood Ratio Test adapted to the specific problem. Each model is trained on the training set and then tested on the test set.

Results: All five models show consistent results, which could, to some extent, indicate the limit of the achievable prediction accuracy. Our results show that with under 30% false alarm rate, the detection rate could be as high as 82%. These accuracy rates translate to a considerable amount of potential savings, if used in practice.

© 2014 Elsevier Ireland Ltd. All rights reserved.
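The Results above are stated as a detection rate at a bounded false alarm rate. As a minimal sketch of how such a figure can be read off a classifier's test-set scores, the Python below sweeps decision thresholds and reports the highest detection rate whose false alarm rate stays within a cap. The function name, the use of `<=` for the cap, and the example scores and labels are all assumptions made for illustration; they are not the authors' code or data.

```python
def detection_rate_at_false_alarm(scores, labels, max_false_alarm):
    """Highest detection rate (true positive rate) achievable while the
    false alarm rate (false positive rate) stays <= max_false_alarm.

    scores: classifier outputs, higher means "more likely hospitalized".
    labels: 1 for hospitalized, 0 otherwise.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")

    best = 0.0
    # Try each distinct score as a threshold; predict positive when score >= threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        if fp / negatives <= max_false_alarm:
            best = max(best, tp / positives)
    return best


# Hypothetical scores and labels, for illustration only (not data from the paper).
example_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
example_labels = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]

rate = detection_rate_at_false_alarm(example_scores, example_labels, 0.30)
print(rate)  # 0.75 for this toy data
```

The same function can be applied to the test-set scores of any of the five model types named in the Methods, which makes their operating points directly comparable at a common false alarm cap.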


Pages: 189-197

Page count: 9
