Data quality assessment framework to assess electronic medical record data for use in research

Cited by: 48
Authors
Reimer, Andrew P. [1 ,2 ]
Milinovich, Alex [2 ]
Madigan, Elizabeth A. [1 ]
Affiliations
[1] Case Western Reserve Univ, Frances Payne Bolton Sch Nursing, 10900 Euclid Ave, Cleveland, OH 44106 USA
[2] Cleveland Clin, 10900 Euclid Ave, Cleveland, OH 44195 USA
Funding
National Institutes of Health (USA);
Keywords
Electronic medical records; Evaluation & assessment; Information storage; Retrieval & integration;
DOI
10.1016/j.ijmedinf.2016.03.006
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Introduction: The proliferation and use of electronic medical records (EMRs) in the clinical setting now provide a rich source of clinical data that can be leveraged to support patient outcomes, comparative effectiveness, and health systems research. Once the large volume and variety of data that robust clinical EMRs provide are aggregated, the suitability of the data for research purposes must be addressed. The purpose of this paper is therefore two-fold. First, we present a stepwise framework capable of guiding initial data quality assessment when matching multiple data sources, regardless of context or application. Second, we demonstrate a use case, an initial analysis of a longitudinal data repository of electronic health record data, that illustrates the first four steps of the framework, and report the results.
Methods: A six-step data quality assessment framework is proposed and described, comprising the following steps: (1) preliminary analysis, (2) documentation-longitudinal concordance, (3) breadth, (4) data element presence, (5) density, and (6) prediction. The framework was applied to the Transport Data Mart, a data repository containing over 28,000 records for patients who underwent interhospital transfer, including EMRs from the sending hospitalization, the transport, and the receiving hospitalization.
Results: There were a total of 9557 log entries, of which 8139 were successfully matched to corresponding hospital encounters. A total of 2832 were successfully mapped to both the sending and receiving hospital encounters (resulting in a 93% automatic matching rate), with 590 including air medical transport EMR data, representing a complete case for testing. Results from Step 2 indicate that once records are identified and matched, there appears to be relatively limited drop-off of additional records when the criteria for matching increase, indicating that a proportion of records consistently contain nearly complete data. Measures of central tendency used in Steps 3 and 4 exhibit right skewness, suggesting that a small proportion of records contain the highest number of repeated measures for the measured variables.
Conclusions: The proposed six-step data quality assessment framework is useful in establishing the metadata for a longitudinal data repository that can be replicated by other studies. There are practical issues that need to be addressed, including the data quality assessments, the most pressing being the need to establish data quality metrics for benchmarking acceptable levels of EMR data inclusiveness through testing and application. © 2016 Elsevier Ireland Ltd. All rights reserved.
Pages: 40-47
Page count: 8