Portability of an algorithm to identify rheumatoid arthritis in electronic health records

Cited by: 166
Authors
Carroll, Robert J. [1 ]
Thompson, Will K. [2 ]
Eyler, Anne E. [3 ]
Mandelin, Arthur M. [4 ]
Cai, Tianxi [5 ]
Zink, Raquel M. [1 ]
Pacheco, Jennifer A. [2 ]
Boomershine, Chad S. [3 ]
Lasko, Thomas A. [1 ]
Xu, Hua [1 ]
Karlson, Elizabeth W. [6 ]
Perez, Raul G. [7 ]
Gainer, Vivian S. [7 ]
Murphy, Shawn N. [7 ,8 ]
Ruderman, Eric M. [4 ]
Pope, Richard M. [4 ]
Plenge, Robert M. [6 ,9 ,10 ]
Kho, Abel Ngo [11 ]
Liao, Katherine P. [6 ]
Denny, Joshua C. [1 ,3 ]
Affiliations
[1] Vanderbilt Univ, Sch Med, Dept Biomed Informat, Eskind Biomed Lib, Nashville, TN 37232 USA
[2] Northwestern Univ, Ctr Genet Med, Evanston, IL USA
[3] Vanderbilt Univ, Sch Med, Dept Med, Nashville, TN 37232 USA
[4] Northwestern Univ, Feinberg Sch Med, Div Rheumatol, Dept Med, Chicago, IL 60611 USA
[5] Harvard Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
[6] Brigham & Womens Hosp, Dept Med, Div Rheumatol Allergy & Immunol, Boston, MA 02115 USA
[7] Partners HealthCare, Res Comp, Charlestown, MA USA
[8] Massachusetts Gen Hosp, Dept Neurol, Boston, MA 02114 USA
[9] Broad Inst, Cambridge, MA USA
[10] Brigham & Womens Hosp, Div Genet, Boston, MA 02115 USA
[11] Northwestern Univ, Feinberg Sch Med, Div Gen Internal Med, Dept Med, Chicago, IL 60611 USA
Keywords
MEDICAL-SCHOOL CURRICULUM; GENOME-WIDE ASSOCIATION; EXTRACTION SYSTEM; ICD-9-CM CODES; DISEASE; RISK; IDENTIFICATION; INFORMATICS; ENTERPRISE; DISCOVERY
DOI
10.1136/amiajnl-2011-000583
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Objectives: Electronic health records (EHR) can allow for the generation of large cohorts of individuals with given diseases for clinical and genomic research. A rate-limiting step is the development of electronic phenotype selection algorithms to find such cohorts. This study evaluated the portability of a published phenotype algorithm to identify rheumatoid arthritis (RA) patients from EHR records at three institutions with different EHR systems.

Materials and Methods: Physicians reviewed charts from three institutions to identify patients with RA. Each institution compiled attributes from various sources in the EHR, including codified data and clinical narratives, which were searched using one of two natural language processing (NLP) systems. The performance of the published model was compared with locally retrained models.

Results: Applying the previously published model from Partners Healthcare to datasets from Northwestern and Vanderbilt Universities, the area under the receiver operating characteristic curve was found to be 92% for Northwestern and 95% for Vanderbilt, compared with 97% at Partners. Retraining the model improved the average sensitivity at a specificity of 97% to 72% from the original 65%. Both the original logistic regression models and locally retrained models were superior to simple billing code count thresholds.

Discussion: These results show that a previously published algorithm for RA is portable to two external hospitals using different EHR systems, different NLP systems, and different target NLP vocabularies. Retraining the algorithm primarily increased the sensitivity at each site.

Conclusion: Electronic phenotype algorithms allow rapid identification of case populations in multiple sites with little retraining.
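The abstract reports two metrics: the area under the receiver operating characteristic curve (AUC) and the sensitivity achieved at a fixed specificity of 97%. The following is a minimal sketch of how such metrics can be computed from model scores and chart-review labels. It is not the authors' code; the function names and the toy data (`toy_scores`, `toy_labels`) are hypothetical and chosen only for illustration, and the data are synthetic, not drawn from the study.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a randomly chosen case scores higher
    than a randomly chosen non-case (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_at_specificity(
    scores: Sequence[float], labels: Sequence[int], target_specificity: float
) -> float:
    """Highest sensitivity over all thresholds whose specificity is at
    least the target. A patient is called a case when score >= threshold."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = 0.0
    for t in sorted(set(scores)):
        specificity = sum(n < t for n in neg) / len(neg)
        sensitivity = sum(p >= t for p in pos) / len(pos)
        if specificity >= target_specificity:
            best = max(best, sensitivity)
    return best


# Synthetic example: 5 chart-confirmed cases (label 1) and 5 non-cases (label 0).
toy_scores = [0.95, 0.90, 0.80, 0.60, 0.40, 0.70, 0.30, 0.20, 0.10, 0.05]
toy_labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

print(auc(toy_scores, toy_labels))                              # 0.92
print(sensitivity_at_specificity(toy_scores, toy_labels, 1.0))  # 0.6
print(sensitivity_at_specificity(toy_scores, toy_labels, 0.8))  # 1.0
```

Fixing the specificity first and then reading off the sensitivity mirrors how the abstract compares the original and retrained models: both are held to the same false-positive rate, so any difference appears as a change in the fraction of true cases recovered.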
Pages: E162-E169
Page count: 8