Natural language processing with machine learning methods to analyze unstructured patient-reported outcomes derived from electronic health records: A systematic review

Cited by: 16
Authors
Sim, Jin-ah [1, 2]
Huang, Xiaolei [3 ]
Horan, Madeline R. [1 ]
Stewart, Christopher M. [4 ]
Robison, Leslie L. [1 ]
Hudson, Melissa M. [1, 5]
Baker, Justin N. [6 ]
Huang, I-Chan [1, 7]
Affiliations
[1] St Jude Childrens Res Hosp, Dept Epidemiol & Canc Control, Memphis, TN USA
[2] Hallym Univ, Sch AI Convergence, Chunchon, South Korea
[3] Univ Memphis, Dept Comp Sci, Memphis, TN USA
[4] Univ Memphis, Inst Intelligent Syst, Memphis, TN USA
[5] St Jude Childrens Res Hosp, Dept Oncol, Memphis, TN USA
[6] Stanford Univ, Dept Genet, Stanford, CA USA
[7] St Jude Childrens Res Hosp, Dept Epidemiol & Canc Control, 262 Danny Thomas Pl, MS735, Memphis, TN 38105 USA
Funding
National Science Foundation (USA)
Keywords
Natural language processing; Machine learning; Patient-reported outcomes; Electronic health records; Unstructured clinical narrative; RESEARCH DOMAIN CRITERIA; CLINICAL NOTES; HEART-FAILURE; MEDICAL-RECORDS; LARGE-SCALE; TEXT; RISK; CARE; DOCUMENTATION; DEPRESSION;
DOI
10.1016/j.artmed.2023.102701
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Objective: Natural language processing (NLP) combined with machine learning (ML) techniques is increasingly used to process unstructured/free-text patient-reported outcome (PRO) data available in electronic health records (EHRs). This systematic review summarizes the literature reporting NLP/ML systems/toolkits for analyzing PROs in clinical narratives of EHRs and discusses future directions for applying this modality in clinical care.
Methods: We searched PubMed, Scopus, and Web of Science for studies written in English between 1/1/2000 and 12/31/2020. Seventy-nine studies meeting the eligibility criteria were included. We abstracted and summarized information related to the study purpose, patient population, type/source/amount of unstructured PRO data, linguistic features, and NLP systems/toolkits for processing unstructured PROs in EHRs.
Results: Most of the studies used NLP/ML techniques to extract PROs from clinical narratives (n = 74) and mapped the extracted PROs into specific PRO domains for phenotyping or clustering purposes (n = 26). Some studies used NLP/ML to process PROs for predicting disease progression or onset of adverse events (n = 22) or for developing/validating NLP/ML pipelines for analyzing unstructured PROs (n = 19). Studies used different linguistic features, including lexical, syntactic, semantic, and contextual features, to process unstructured PROs. Among the 25 NLP systems/toolkits we identified, 15 used rule-based NLP, 6 used hybrid NLP, and 4 used non-neural ML algorithms embedded in NLP.
Conclusions: This study supports the potential utility of different NLP/ML techniques in processing unstructured PROs available in EHRs for clinical care. Although annotation rules remain the dominant NLP/ML approach for analyzing unstructured PROs, deploying novel neural ML-based methods is warranted.
Pages: 11