Data mining information from electronic health records produced high yield and accuracy for current smoking status

Cited by: 18
Authors
Groenhof, T. Katrien J. [1 ]
Koers, Laurien R. [1 ]
Blasse, Enja [2 ]
de Groot, Mark [2 ]
Grobbee, Diederick E. [1 ]
Bots, Michiel L. [1 ]
Asselbergs, Folkert W. [3 ,4 ,5 ]
Lely, A. Titia [6 ]
Haitjema, Saskia [2 ]
van Solinge, Wouter
Hoefer, Imo
Haitjema, Saskia [2 ]
de Groot, Mark [2 ]
Asselbergs, F. W. [7 ]
de Borst, G. J. [8 ]
Bots, M. L. [9 ,10 ]
Dieleman, S.
Emmelot, M. H. [11 ]
de Jong, P. A. [12 ]
Lely, A. T. [13 ]
Hoefer, I. E. [14 ]
van der Kaaij, N. P. [15 ]
Ruigrok, Y. M. [16 ]
Verhaar, M. C. [17 ]
Visseren, F. L. J. [18 ,19 ]
Affiliations
[1] Univ Utrecht, Univ Med Ctr Utrecht, Julius Ctr Hlth Sci & Primary Care, Utrecht, Netherlands
[2] Univ Utrecht, Univ Med Ctr Utrecht, Lab Clin Chem & Hematol, Utrecht, Netherlands
[3] UCL, Inst Hlth Informat, Hlth Data Res UK, London, England
[4] Univ Utrecht, Univ Med Ctr Utrecht, Div Heart & Lungs, Dept Cardiol, Utrecht, Netherlands
[5] UCL, Fac Populat Hlth Sci, Inst Cardiovasc Sci, London, England
[6] Univ Utrecht, Univ Med Ctr Utrecht, Wilhelmina Childrens Hosp, Dept Obstet, Utrecht, Netherlands
[7] Univ Med Ctr Utrecht, Dept Cardiol, Utrecht, Netherlands
[8] Univ Med Ctr Utrecht, Dept Vasc Surg, Utrecht, Netherlands
[9] Univ Med Ctr Utrecht, Julius Ctr Hlth Sci & Primary Care, Utrecht, Netherlands
[10] Univ Med Ctr Utrecht, Div Vital Funct Anesthesiol & Intens Care, Utrecht, Netherlands
[11] Univ Med Ctr Utrecht, Dept Geriatr, Utrecht, Netherlands
[12] Univ Med Ctr Utrecht, Dept Radiol, Utrecht, Netherlands
[13] Univ Med Ctr Utrecht, Dept Obstet Gynecol, Utrecht, Netherlands
[14] Univ Med Ctr Utrecht, Lab Clin Chem & Hematol, Utrecht, Netherlands
[15] Univ Med Ctr Utrecht, Dept Cardiothorac Surg, Utrecht, Netherlands
[16] Univ Med Ctr Utrecht, Dept Neurol, Utrecht, Netherlands
[17] Univ Med Ctr Utrecht, Dept Hypertens & Nephrol, Utrecht, Netherlands
[18] Univ Med Ctr Utrecht, Dept Vasc Med, Utrecht, Netherlands
[19] Univ Utrecht, Utrecht, Netherlands
Keywords
Data mining; Electronic health records; Routine clinical data; Learning healthcare system; Data quality; Text mining; EVENTS; RISK; CARE;
DOI
10.1016/j.jclinepi.2019.11.006
Chinese Library Classification (CLC) number
R19 [Health care organization and services (health services administration)];
Abstract
Objectives: Researchers are increasingly using routine clinical data for care evaluations and feedback to patients and clinicians. The quality of these evaluations depends on the quality and completeness of the input data. Study Design and Setting: We assessed the performance of an electronic health record (EHR)-based data mining algorithm, using smoking status in a cardiovascular population as an example. As a reference standard, we used the questionnaire from the Utrecht Cardiovascular Cohort (UCC). To assess diagnostic accuracy, we calculated sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV). Results: We analyzed 1,661 patients included in the UCC up to January 18, 2019. Of those, 14% (n = 238) had missing information on smoking status in the UCC questionnaire. Data mining provided information on smoking status in 99% of the 1,661 participants. Diagnostic accuracy for current smoking was sensitivity 88%, specificity 92%, NPV 98%, and PPV 63%. Of the false positives, 85% reported that they had quit smoking at the time of the UCC. Conclusion: Data mining showed great potential in retrieving information on smoking (a near-complete yield). Its diagnostic performance is good for negative smoking statuses. The implications of misclassification with data mining depend on the application of the data. © 2019 The Authors. Published by Elsevier Inc.
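The four accuracy measures named in the abstract all come from a 2x2 table that cross-tabulates the data-mined status against the reference standard. The following is a minimal sketch of that calculation, not the authors' code: the class name, function name, and the cell counts are hypothetical and chosen only for illustration, since the record does not report the study's underlying 2x2 table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a 2x2 table: index test (data mining) vs. reference standard."""
    tp: int  # mined "current smoker", reference "current smoker"
    fp: int  # mined "current smoker", reference "not current smoker"
    fn: int  # mined "not current smoker", reference "current smoker"
    tn: int  # mined "not current smoker", reference "not current smoker"


def diagnostic_accuracy(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, PPV and NPV as fractions in [0, 1]."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # true positives among reference positives
        "specificity": c.tn / (c.tn + c.fp),  # true negatives among reference negatives
        "ppv": c.tp / (c.tp + c.fp),          # reference positives among mined positives
        "npv": c.tn / (c.tn + c.fn),          # reference negatives among mined negatives
    }


# Hypothetical counts for illustration only; these are NOT the study's data.
example = ConfusionCounts(tp=40, fp=20, fn=10, tn=130)
metrics = diagnostic_accuracy(example)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Because PPV divides by all mined positives, it is the measure that falls when false positives accumulate, for instance when former smokers are still flagged as current smokers.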
Pages: 100-106
Page count: 7
Related papers
23 in total
  • [1] [Anonymous], TIDYTEXT
  • [2] [Anonymous], TEXT MINING SUPPORT
  • [3] Uniform data collection in routine clinical practice in cardiovascular patients for optimal care, quality control and research: The Utrecht Cardiovascular Cohort
    Asselbergs, Folkert W.
    Visseren, Frank L. J.
    Bots, Michiel L.
    de Borst, Gert J.
    Buijsrogge, Marc P.
    Dieleman, Jan M.
    van Dinther, Baukje G. F.
    Doevendans, Pieter A.
    Hoefer, Imo E.
    Hollander, Monika
    de Jong, Pim A.
    Koenen, Steven V.
    Pasterkamp, Gerard
    Ruigrok, Ynte M.
    van der Schouw, Yvonne T.
    Verhaar, Marianne C.
    Grobbee, Diederick E.
    [J]. EUROPEAN JOURNAL OF PREVENTIVE CARDIOLOGY, 2017, 24 (08) : 840 - 847
  • [4] Development of an algorithm for determining smoking status and behaviour over the life course from UK electronic primary care records
    Atkinson, Mark D.
    Kennedy, Jonathan I.
    John, Ann
    Lewis, Keir E.
    Lyons, Ronan A.
    Brophy, Sinead T.
    [J]. BMC MEDICAL INFORMATICS AND DECISION MAKING, 2017, 17
  • [5] Decline in risk of recurrent cardiovascular events in the period 1996 to 2014 partly explained by better treatment of risk factors and less subclinical atherosclerosis
    Berkelmans, Gijs F. N.
    van der Graaf, Yolanda
    Dorresteijn, Jannick A. N.
    de Borst, Gert Jan
    Cramer, Maarten J.
    Kappelle, L. Jaap
    Westerink, Jan
    Visseren, Frank L. J.
    [J]. INTERNATIONAL JOURNAL OF CARDIOLOGY, 2018, 251 : 96 - 102
  • [6] A Human(e) Factor in Clinical Decision Support Systems
    Bezemer, Tim
    de Groot, Mark C. H.
    Blasse, Enja
    ten Berg, Maarten J.
    Kappen, Teus H.
    Bredenoord, Annelien L.
    van Solinge, Wouter W.
    Hoefer, Imo E.
    Haitjema, Saskia
    [J]. JOURNAL OF MEDICAL INTERNET RESEARCH, 2019, 21 (03)
  • [7] The Learning Healthcare System: Where are we now? A systematic review
    Budrionis, Andrius
    Bellika, Johan Gustav
    [J]. JOURNAL OF BIOMEDICAL INFORMATICS, 2016, 64 : 87 - 92
  • [8] Development and validation of a prediction rule for recurrent vascular events based on a cohort study of patients with arterial disease: the SMART risk score
    Dorresteijn, Johannes A. N.
    Visseren, Frank L. J.
    Wassink, Annemarie M. J.
    Gondrie, Martijn J. A.
    Steyerberg, Ewout W.
    Ridker, Paul M.
    Cook, Nancy R.
    van der Graaf, Yolanda
    [J]. HEART, 2013, 99 (12) : 866 - 872
  • [9] Foley T., 2015, The Potential of Learning Healthcare Systems
  • [10] Extracting information from the text of electronic medical records to improve case detection: a systematic review
    Ford, Elizabeth
    Carroll, John A.
    Smith, Helen E.
    Scott, Donia
    Cassell, Jackie A.
    [J]. JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2016, 23 (05) : 1007 - 1015