Examining the impact of data quality and completeness of electronic health records on predictions of patients' risks of cardiovascular disease

Cited by: 17
Authors
Li, Yan [1 ]
Sperrin, Matthew [1 ]
Martin, Glen P. [1 ]
Ashcroft, Darren M. [2 ,3 ]
van Staa, Tjeerd Pieter [1 ,4 ,5 ]
Affiliations
[1] Univ Manchester, Manchester Acad Hlth Sci Ctr, Hlth E Res Ctr, Fac Biol Med & Hlth,Farr Inst,Sch Hlth Sci, Oxford Rd, Manchester M13 9PL, Lancs, England
[2] Univ Manchester, Fac Biol Med & Hlth, Sch Hlth Sci, Ctr Pharmacoepidemiol & Drug Safety, Oxford Rd, Manchester M13 9PL, Lancs, England
[3] Univ Manchester, Fac Biol Med & Hlth, Sch Hlth Sci, NIHR Greater Manchester Patient Safety Translat R, Oxford Rd, Manchester M13 9PL, Lancs, England
[4] Univ Utrecht, Utrecht Inst Pharmaceut Sci, Utrecht, Netherlands
[5] British Lib, Alan Turing Inst, London, England
Keywords
Electronic health records; QRISK; Practice variability; Statistical frailty model; CVD risk prediction; Random slope model; 10-YEAR RISK; MODELS
DOI
10.1016/j.ijmedinf.2019.104033
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Objective: To assess the extent of variation of data quality and completeness of electronic health records and impact on the robustness of risk predictions of incident cardiovascular disease (CVD) using a risk prediction tool that is based on routinely collected data (QRISK3).
Design: Longitudinal cohort study.
Settings: 392 general practices (including 3.6 million patients) linked to hospital admission data.
Methods: Variation in data quality was assessed using Saez's stability metrics quantifying outlyingness of each practice. Statistical frailty models evaluated whether accuracy of QRISK3 predictions on individual predictions and effects of overall risk factors (linear predictor) varied between practices.
Results: There was substantial heterogeneity between practices in CVD incidence unaccounted for by QRISK3. In the lowest quintile of statistical frailty, a QRISK3 predicted risk of 10% for female was in a range between 7.1% and 9.0% when incorporating practice variability into the statistical frailty models; for the highest quintile, this was 10.9%-16.4%. Data quality (using Saez metrics) and completeness were comparable across different levels of statistical frailty. For example, recording of missing information on ethnicity was 55.7%, 62.7%, 57.8%, 64.8% and 62.1% for practices from lowest to highest quintiles of statistical frailty respectively. The effects of risk factors did not vary between practices with little statistical variation of beta coefficients.
Conclusions: The considerable unmeasured heterogeneity in CVD incidence between practices was not explained by variations in data quality or effects of risk factors. QRISK3 risk prediction should be supplemented with clinical judgement and evidence of additional risk factors.
Pages: 9
Related papers
50 in total
[21]   A method for cohort selection of cardiovascular disease records from an electronic health record system [J].
Fernandes Abrahao, Maria Tereza ;
Cuce Nobre, Moacyr Roberto ;
Gutierrez, Marco Antonio .
INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2017, 102 :138-149
[22]   Using Electronic Health Records and Machine Learning to Make Medical-Related Predictions from Non-Medical Data [J].
Pitoglou, Stavros ;
Koumpouros, Yiannis ;
Anastasiou, Athanasios .
2018 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND DATA ENGINEERING (ICMLDE 2018), 2018, :56-60
[23]   Machine Learning Analysis for Data Incompleteness (MADI): Analyzing the Data Completeness of Patient Records Using a Random Variable Approach to Predict the Incompleteness of Electronic Health Records [J].
Gurupur, Varadraj P. ;
Shelleh, Muhammed .
IEEE ACCESS, 2021, 9 :95994-96001
[24]   Profiling genetic variants in cardiovascular disease genes among a Heterogeneous cohort of Mendelian conditions patients and electronic health records [J].
Akawi, Nadia ;
Al Mansoori, Ghadeera ;
Al Zaabi, Anwar ;
Badics, Andrea ;
Al Dhaheri, Noura ;
Al Shamsi, Aisha ;
Al Tenaiji, Amal ;
Alzohily, Bashar ;
Almesmari, Fatmah S. A. ;
Al Hammadi, Hamad ;
Al Dhahouri, Nahid ;
Irshaid, Manal ;
Kizhakkedath, Praseetha ;
Al Shibli, Fatema ;
Tabouni, Mohammed ;
Allam, Mushal ;
Baydoun, Ibrahim ;
Alblooshi, Hiba ;
Ali, Bassam R. ;
Foo, Roger S. ;
Al Jasmi, Fatma .
FRONTIERS IN MOLECULAR BIOSCIENCES, 2024, 11
[25]   Modeling Large Sparse Data for Feature Selection: Hospital Admission Predictions of the Dementia Patients Using Primary Care Electronic Health Records [J].
Tsang, Gavin ;
Zhou, Shang-Ming ;
Xie, Xianghua .
IEEE JOURNAL OF TRANSLATIONAL ENGINEERING IN HEALTH AND MEDICINE, 2021, 9 :1-13
[26]   Impact of the implementation of electronic health records on the quality of discharge summaries and on the coding of hospitalization episodes [J].
Bernal, Jose L. ;
DelBusto, Sebastian ;
Garcia-Manoso, Maria I. ;
de Castro Monteiro, Emilia ;
Moreno, Angel ;
Varela-Rodriguez, Carolina ;
Ruiz-Lopez, Pedro M. .
INTERNATIONAL JOURNAL FOR QUALITY IN HEALTH CARE, 2018, 30 (08) :630-636
[27]   Factors influencing the quality of vital sign data in electronic health records: A qualitative study [J].
Stevenson, Jean E. ;
Israelsson, Johan ;
Petersson, Goran ;
Bath, Peter A. .
JOURNAL OF CLINICAL NURSING, 2018, 27 (5-6) :1276-1286
[28]   The Impact of Electronic Health Records on the Duration of Patients' Visits: Time and Motion Study [J].
Jabour, Abdulrahman Mohammed .
JMIR MEDICAL INFORMATICS, 2020, 8 (02)
[29]   Predicting Future Cardiovascular Events in Patients With Peripheral Artery Disease Using Electronic Health Record Data [J].
Ross, Elsie Gyang ;
Jung, Kenneth ;
Dudley, Joel T. ;
Li, Li ;
Leeper, Nicholas J. ;
Shah, Nigam H. .
CIRCULATION-CARDIOVASCULAR QUALITY AND OUTCOMES, 2019, 12 (03)
[30]   Deep Learning Analysis of Polish Electronic Health Records for Diagnosis Prediction in Patients with Cardiovascular Diseases [J].
Anetta, Kristof ;
Horak, Ales ;
Wojakowski, Wojciech ;
Wita, Krystian ;
Jadczyk, Tomasz .
JOURNAL OF PERSONALIZED MEDICINE, 2022, 12 (06)