Coronary artery disease risk assessment from unstructured electronic health records using text mining

Cited by: 51
Authors
Jonnagaddala, Jitendra [1, 2, 3]
Liaw, Siaw-Teng [1]
Ray, Pradeep [2]
Kumar, Manish [3]
Chang, Nai-Wen [4, 5]
Dai, Hong-Jie [6]
Affiliations
[1] Univ New South Wales, Sch Publ Hlth & Community Med, Sydney, NSW, Australia
[2] Univ New South Wales, Asia Pacific Ubiquitous Healthcare Res Ctr, Sydney, NSW, Australia
[3] Univ New South Wales, Prince Wales Clin Sch, Sydney, NSW, Australia
[4] Acad Sinica, Inst Informat Sci, Taipei, Taiwan
[5] Natl Taiwan Univ, Grad Inst Biomed Elect & Bioinformat, Taipei, Taiwan
[6] Natl Taitung Univ, Dept Comp Sci & Informat Engn, Taitung, Taiwan
Keywords
Coronary artery disease; Text mining; Framingham risk score; Temporal data; EHR;
DOI
10.1016/j.jbi.2015.08.003
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Coronary artery disease (CAD) often leads to myocardial infarction, which may be fatal. Risk factors can be used to predict CAD, which may subsequently lead to prevention or early intervention. Patient data such as co-morbidities, medication history, social history and family history are required to determine the risk factors for a disease. However, risk factor data are usually embedded in unstructured clinical narratives if the data are not collected specifically for risk assessment purposes. Clinical text mining can be used to extract data related to risk factors from unstructured clinical notes. This study presents methods to extract Framingham risk factors from unstructured electronic health records using clinical text mining and to calculate 10-year coronary artery disease risk scores in a cohort of diabetic patients. We developed a rule-based system to extract risk factors: age, gender, total cholesterol, HDL-C, blood pressure, diabetes history and smoking history. The results showed that the output from the text mining system was reliable, but a significant amount of the data needed to calculate the Framingham risk score was missing. A systematic approach for understanding missing data was followed by implementation of imputation strategies. An analysis of the 10-year Framingham risk scores for coronary artery disease in this cohort showed that the majority of the diabetic patients are at moderate risk of CAD. © 2015 Elsevier Inc. All rights reserved.
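The rule-based extraction mentioned in the abstract can be pictured with a minimal Python sketch. It is illustrative only and is not the system built in the study: the regular expressions, the function name extract_risk_factors, and the sample note are all assumptions made for this example. It covers three of the listed risk factors (blood pressure, total cholesterol, smoking) and ignores negation and temporal context, which a real clinical system would have to handle.

```python
import re

# Hypothetical patterns for illustration only; not the rules used in the study.
BP_PATTERN = re.compile(
    r"\b(?:BP|blood pressure)\D{0,15}(\d{2,3})\s*/\s*(\d{2,3})", re.IGNORECASE
)
CHOL_PATTERN = re.compile(r"\btotal cholesterol\D{0,15}(\d{2,3})", re.IGNORECASE)
SMOKER_PATTERN = re.compile(r"\b(?:current smoker|smokes)\b", re.IGNORECASE)


def extract_risk_factors(note):
    """Return the risk factors found in a free-text note as a dict.

    A factor that is not mentioned is simply absent from the result,
    which is how missing data would surface downstream.
    """
    found = {}
    bp = BP_PATTERN.search(note)
    if bp:
        found["systolic_bp"] = int(bp.group(1))
        found["diastolic_bp"] = int(bp.group(2))
    chol = CHOL_PATTERN.search(note)
    if chol:
        found["total_cholesterol"] = int(chol.group(1))
    if SMOKER_PATTERN.search(note):
        found["smoker"] = True
    return found


# Made-up sample note, not taken from any real record.
note = (
    "62 yo male with diabetes. BP today 142/88. "
    "Total cholesterol was 215 mg/dL. Current smoker."
)
print(extract_risk_factors(note))
```

A note that never mentions cholesterol yields a result without a total_cholesterol key, which is the kind of gap the abstract's imputation strategies are meant to address.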
Pages: S203-S210
Page count: 8