A Review of Automatic Phenotyping Approaches using Electronic Health Records

Cited by: 33
Authors
Alzoubi, Hadeel [1]
Alzubi, Raid [2]
Ramzan, Naeem [3]
West, Daune [3]
Al-Hadhrami, Tawfik [4]
Alazab, Mamoun [5]
Affiliations
[1] Jordan Univ Sci & Technol, Sch Comp & Informat Technol, Irbid 22110, Jordan
[2] Middle East Univ, Fac Informat Technol, Dept Comp Sci, Amman 11831, Jordan
[3] Univ West Scotland, Sch Engn & Comp, Paisley PA1 2BE, Renfrew, Scotland
[4] Nottingham Trent Univ, Sch Sci & Technol, Nottingham NG11 8NS, England
[5] Charles Darwin Univ, Coll Engn IT & Environm, Darwin, NT 0815, Australia
Keywords
electronic health records; phenotyping; natural language processing; machine learning; rule-based; SUPPORT VECTOR MACHINE; MEDICAL-RECORDS; RHEUMATOID-ARTHRITIS; INFLUENZA DETECTION; EXTRACTION SYSTEM; CLINICAL NOTES; TEXT ANALYSIS; IDENTIFICATION; IDENTIFY; VALIDATION
DOI
10.3390/electronics8111235
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Electronic Health Records (EHRs) are a rich repository of valuable clinical information held in primary and secondary care databases. To utilize EHRs for medical observational research, a range of algorithms for automatically identifying individuals with a specific phenotype have been developed. This review summarizes and critically evaluates the literature on the development of EHR phenotyping systems, and describes phenotyping systems and techniques based on structured and unstructured EHR data. Articles published on PubMed and Google Scholar between 2013 and 2017 were reviewed, using search terms derived from Medical Subject Headings (MeSH). The use of Natural Language Processing (NLP) techniques to extract features from narrative text has grown in popularity. This increased attention is due to the availability of open source NLP algorithms, combined with improvements in accuracy. In this review, concept extraction is the most popular NLP technique, having been used by more than 50% of the reviewed papers to extract features from EHRs. High-throughput phenotyping systems using unsupervised machine learning techniques have gained popularity due to their ability to extract a phenotype efficiently and automatically with minimal human effort.
Pages: 23