Natural Language Processing in Electronic Health Records in relation to healthcare decision-making: A systematic review

Cited by: 85
Authors
Hossain, Elias [1]
Rana, Rajib [2]
Higgins, Niall [3, 4, 5]
Soar, Jeffrey [6]
Barua, Prabal Datta [6]
Pisani, Anthony R. [7]
Turner, Kathryn [4]
Affiliations
[1] North South Univ, Sch Engn & Phys Sci, Dhaka 1229, Bangladesh
[2] Univ Southern Queensland, Sch Math Phys & Comp, Springfield Cent 4300, Australia
[3] Univ Southern Queensland, Sch Management & Enterprise, Darling Hts, QLD 4350, Australia
[4] Queensland Univ Technol, Sch Nursing, Brisbane, Qld 4000, Australia
[5] Metro North Mental Hlth, Herston, Qld 4029, Australia
[6] Univ Southern Queensland, Sch Business, Springfield Cent, Qld 4300, Australia
[7] Univ Rochester, Ctr Study & Prevent Suicide, Rochester, NY USA
Funding
National Institutes of Health (USA);
Keywords
Machine learning; Electronic Health Records; Medical natural language processing; Artificial intelligence in medicine; Automated tools; State-of-the-art deep learning; CLINICAL INFORMATION EXTRACTION; TEXT CLASSIFICATION; AUTOMATED DETECTION; IDENTIFICATION; ARCHITECTURE; FRAMEWORK; NLP;
DOI
10.1016/j.compbiomed.2023.106649
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Background: Natural Language Processing (NLP) is widely used to extract clinical insights from Electronic Health Records (EHRs). However, a lack of annotated data and automated tools, among other challenges, hinders the full utilisation of NLP for EHRs. Various Machine Learning (ML), Deep Learning (DL) and NLP techniques are studied and compared to understand the limitations and opportunities in this space comprehensively.
Methodology: After screening 261 articles from 11 databases, we included 127 papers for full-text review covering seven categories of articles: (1) medical note classification, (2) clinical entity recognition, (3) text summarisation, (4) deep learning (DL) and transfer learning architecture, (5) information extraction, (6) medical language translation and (7) other NLP applications. This study follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.
Results and Discussion: EHR was the most commonly used data type among the selected articles, and the datasets were primarily unstructured. Various ML and DL methods were used, with prediction or classification being the most common application of ML or DL. The most common use cases were International Classification of Diseases, Ninth Revision (ICD-9) classification, clinical note analysis, and named entity recognition (NER) for clinical descriptions and research on psychiatric disorders.
Conclusion: We find that the adopted ML models were not adequately assessed. In addition, data imbalance is an important problem, and techniques to address this underlying problem are still needed. Future studies should address key limitations in existing studies, primarily in identifying lupus nephritis, suicide attempts and perinatal self-harm, and in ICD-9 classification.
Pages: 24