Prediction of hospitalization due to heart diseases by supervised learning methods

Cited by: 78
Authors
Dai, Wuyang [1, 2]
Brisimi, Theodora S. [1, 2]
Adams, William G. [3, 4]
Mela, Theofanie [5]
Saligrama, Venkatesh [1, 2]
Paschalidis, Ioannis Ch. [1, 2]
Affiliations
[1] Boston Univ, Dept Elect & Comp Engn, Boston, MA 02215 USA
[2] Boston Univ, Div Syst Engn, Boston, MA 02215 USA
[3] Boston Univ, Sch Med, Dept Pediat, Boston, MA 02118 USA
[4] Boston Med Ctr, Boston, MA 02118 USA
[5] Massachusetts Gen Hosp, Arrhythmia Serv, Electrophysiol Lab, Boston, MA 02114 USA
Funding
National Science Foundation (USA);
Keywords
Prevention; Predictive models; Hospitalization; Heart diseases; Machine learning; Electronic Health Records (EHRs); Electronic medical records; Primary care; Risk;
DOI
10.1016/j.ijmedinf.2014.10.002
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Background: In 2008, the United States spent $2.2 trillion on healthcare, which was 15.5% of its GDP. 31% of this expenditure was attributed to hospital care. Evidently, even modest reductions in hospital care costs matter. A 2009 study showed that nearly $30.8 billion in hospital care costs during 2006 was potentially preventable, with heart diseases being responsible for about 31% of that amount.
Methods: Our goal is to accurately and efficiently predict heart-related hospitalizations based on the available patient-specific medical history. To the best of our knowledge, the approaches we introduce are novel for this problem. The prediction of hospitalization is formulated as a supervised classification problem. We use de-identified Electronic Health Record (EHR) data from a large urban hospital in Boston to identify patients with heart diseases. Patients are labeled and randomly partitioned into a training and a test set. We apply five machine learning algorithms, namely Support Vector Machines (SVM), AdaBoost using trees as the weak learner, logistic regression, a naive Bayes event classifier, and a variation of a Likelihood Ratio Test adapted to the specific problem. Each model is trained on the training set and then tested on the test set.
Results: All five models show consistent results, which could, to some extent, indicate the limit of the achievable prediction accuracy. Our results show that with a false alarm rate under 30%, the detection rate could be as high as 82%. These accuracy rates translate to a considerable amount of potential savings, if used in practice. © 2014 Elsevier Ireland Ltd. All rights reserved.
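The evaluation setup described in the abstract (a random training/test partition, then a detection rate reported at a bounded false alarm rate) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 40% test fraction, and the toy scores and labels are all assumptions made for illustration, and the scores stand in for the output of any of the five classifiers.

```python
import random


def split_train_test(records, test_fraction=0.4, seed=0):
    """Randomly partition labeled records into (train, test).

    test_fraction is an assumed value; the abstract does not state the split.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def rates_at_threshold(scores, labels, threshold):
    """Return (detection_rate, false_alarm_rate) when score >= threshold predicts 1.

    Assumes labels contain at least one positive (1) and one negative (0).
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, fp / n_neg


def best_detection_under_cap(scores, labels, cap=0.30):
    """Scan candidate thresholds and keep the one with the highest
    detection rate whose false alarm rate does not exceed cap.

    Returns (threshold, detection_rate, false_alarm_rate), or None.
    """
    best = None
    for t in sorted(set(scores)):
        det, fa = rates_at_threshold(scores, labels, t)
        if fa <= cap and (best is None or det > best[1]):
            best = (t, det, fa)
    return best


# Toy test-set scores (hypothetical classifier outputs) and true labels,
# where 1 means the patient was hospitalized.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]

train_ids, test_ids = split_train_test(range(10))
best = best_detection_under_cap(toy_scores, toy_labels, cap=0.30)
print(best)
```

Each threshold corresponds to one point on a classifier's ROC curve, so a statement such as "under 30% false alarm rate, detection rate as high as 82%" describes a single operating point chosen in this way.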
Pages: 189-197
Number of pages: 9
Related papers
21 in total
[1]  
Agarwal J., 2012, THESIS U WASHINGTON
[2]   Algorithmic Prediction of Health-Care Costs [J].
Bertsimas, Dimitris ;
Bjarnadottir, Margret V. ;
Kane, Michael A. ;
Kryder, J. Christian ;
Pandey, Rudra ;
Vempala, Santosh ;
Wang, Grant .
OPERATIONS RESEARCH, 2008, 56 (06) :1382-1392
[3]  
Bishop Christopher, 2006, Pattern Recognition and Machine Learning, DOI 10.1117/1.2819119
[4]   Using EHR data to predict hospital-acquired pressure ulcers: A prospective study of a Bayesian Network model [J].
Cho, Insook ;
Park, Ihnsook ;
Kim, Eunman ;
Lee, Eunjoon ;
Bates, David W. .
INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2013, 82 (11) :1059-1067
[5]  
CORTES C, 1995, MACH LEARN, V20, P273, DOI 10.1023/A:1022627411411
[6]   General cardiovascular risk profile for use in primary care - The Framingham Heart Study [J].
D'Agostino, Ralph B. ;
Vasan, Ramachandran S. ;
Pencina, Michael J. ;
Wolf, Philip A. ;
Cobain, Mark ;
Massaro, Joseph M. ;
Kannel, William B. .
CIRCULATION, 2008, 117 (06) :743-753
[7]  
Freund Y., 1999, Journal of Japanese Society for Artificial Intelligence, V14, P771
[8]   Hospitalization Epidemic in Patients With Heart Failure: Risk Factors, Risk Prediction, Knowledge Gaps, and Future Directions [J].
Giamouzis, Gregory ;
Kalogeropoulos, Andreas ;
Georgiopoulou, Vasiliki ;
Laskar, Sonjoy ;
Smith, Andrew L. ;
Dunbar, Sandra ;
Triposkiadis, Filippos ;
Butler, Javed .
JOURNAL OF CARDIAC FAILURE, 2011, 17 (01) :54-75
[10]  
Hastie T., 2009, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed.