Automated System to Capture Patient Symptoms From Multitype Japanese Clinical Texts: Retrospective Study

Cited by: 0
Authors
Nishiyama, Tomohiro [1]
Yamaguchi, Ayane [2]
Han, Peitao [1]
Pereira, Lis Weiji Kanashiro [3]
Otsuki, Yuka [1]
Andrade, Gabriel Herman Bernardim [1]
Kudo, Noriko [1]
Yada, Shuntaro [1]
Wakamiya, Shoko [1]
Aramaki, Eiji [1]
Takada, Masahiro [2, 4, 5]
Toi, Masakazu
Affiliations
[1] Nara Inst Sci & Technol, Dept Informat Sci, 8916-5 Takayama Cho, Ikoma 6300192, Japan
[2] Kyoto Univ, Grad Sch Med, Kyoto, Japan
[3] Adv ICT Res Inst, Ctr Informat & Neural Networks, Osaka, Japan
[4] Kansai Med Univ, Dept Breast Surg, Hirakata, Japan
[5] Komagome Hosp, Tokyo Metropolitan Canc & Infect Dis Ctr, Tokyo, Japan
Funding
Japan Science and Technology Agency; Japan Society for the Promotion of Science
Keywords
natural language processing; named entity recognition; adverse drug reaction; adverse event; peripheral neuropathy; NLP; symptoms; symptom; machine learning; ML; drug; drugs; pharmacology; pharmaceutic; pharmaceutics; pharmaceuticals; pharmaceutical; medication; medications; adverse; neuropathy; cancer; oncology; text; texts; textual; note; notes; report; reports; EHR; EHRs; record; records; detect; detection; detecting;
DOI
10.2196/58977
Chinese Library Classification number
R-058
Subject classification number
Abstract
Background: Natural language processing (NLP) techniques can be used to analyze large amounts of electronic health record texts, which encompass various types of patient information such as quality of life, effectiveness of treatments, and adverse drug event (ADE) signals. As different aspects of a patient's status are stored in different types of documents, we propose an NLP system capable of processing 6 types of documents: physician progress notes, discharge summaries, radiology reports, radioisotope reports, nursing records, and pharmacist progress notes.
Objective: This study aimed to investigate the system's performance in detecting ADEs by evaluating the results from multitype texts. The main objective is to detect adverse events accurately using an NLP system.
Methods: We used data written in Japanese from 2289 patients with breast cancer, including medication data, physician progress notes, discharge summaries, radiology reports, radioisotope reports, nursing records, and pharmacist progress notes. Our system performs 3 processes: named entity recognition, normalization of symptoms, and aggregation of multiple types of documents from multiple patients. Among all patients with breast cancer, 103 and 112 with peripheral neuropathy (PN) received paclitaxel or docetaxel, respectively. We evaluated the utility of using multiple types of documents with correlation coefficients and regression analysis, comparing their performance with that of each single type of document. All evaluations of detection rates with our system are performed 30 days after drug administration.
Results: Our system estimated the incidence of paclitaxel-induced PN at 60.7%, compared with 74.0% in previous research based on manual extraction, an underestimate of 13.3 percentage points. The Pearson correlation coefficient between the manual extraction and system results was 0.87. Although the pharmacist progress notes had the highest detection rate among the individual document types, that rate did not match the performance obtained using all documents. The estimated median duration of PN with paclitaxel was 92 days, whereas the previously reported median duration of PN with paclitaxel was 727 days. The number of events detected in each document was highest in the physician's progress notes, followed by the pharmacist's and nursing records.
Conclusions: Given the inherent cost of conditions that require constant monitoring of the patient, such as the treatment of PN, our system has a significant advantage in that it can immediately estimate the treatment duration without fine-tuning a new NLP model. Leveraging multitype documents is better than using single-type documents to improve detection performance. Although the onset time estimation was relatively accurate, the duration might have been influenced by the length of the data follow-up period. The results suggest that our method using various types of data can detect more ADEs from clinical documents.
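The aggregation step and the 30-day evaluation described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record layout, the function names (`detected_patients`, `detection_rate`), the sample data, and the reading of "30 days after drug administration" as a window from the administration date to 30 days later are all assumptions made for illustration. The sketch takes symptom mentions that have already been extracted and normalized, pools them across document types per patient, and compares the detection rate for all documents with the rate for a single document type.

```python
from datetime import date, timedelta

# Hypothetical, already-normalized symptom mentions:
# (patient_id, document_type, note_date, normalized_symptom)
MENTIONS = [
    ("p1", "physician_progress_note", date(2020, 1, 10), "peripheral neuropathy"),
    ("p1", "nursing_record", date(2020, 1, 12), "peripheral neuropathy"),
    ("p2", "pharmacist_progress_note", date(2020, 2, 20), "peripheral neuropathy"),
    ("p3", "radiology_report", date(2020, 5, 1), "peripheral neuropathy"),
]

# Hypothetical drug administration date for each treated patient.
ADMIN_DATES = {
    "p1": date(2020, 1, 1),
    "p2": date(2020, 2, 1),
    "p3": date(2020, 3, 1),
    "p4": date(2020, 4, 1),
}


def detected_patients(mentions, admin_dates, symptom, window_days=30, doc_types=None):
    """Return the patients with at least one mention of `symptom` inside the window.

    Assumption: the window runs from the administration date to
    `window_days` days later, inclusive. If `doc_types` is given, only
    mentions from those document types are counted.
    """
    found = set()
    for patient_id, doc_type, note_date, name in mentions:
        if name != symptom:
            continue
        if doc_types is not None and doc_type not in doc_types:
            continue
        start = admin_dates.get(patient_id)
        if start is None:
            continue  # patient did not receive the drug
        if start <= note_date <= start + timedelta(days=window_days):
            found.add(patient_id)
    return found


def detection_rate(mentions, admin_dates, symptom, window_days=30, doc_types=None):
    """Fraction of treated patients with the symptom detected in the window."""
    hits = detected_patients(mentions, admin_dates, symptom, window_days, doc_types)
    return len(hits) / len(admin_dates)


all_docs = detection_rate(MENTIONS, ADMIN_DATES, "peripheral neuropathy")
pharmacist_only = detection_rate(
    MENTIONS, ADMIN_DATES, "peripheral neuropathy",
    doc_types={"pharmacist_progress_note"},
)
print(f"all documents: {all_docs:.2f}, pharmacist notes only: {pharmacist_only:.2f}")
```

In this toy data, pooling document types finds patients that a single document type misses, which is the kind of comparison the abstract reports between multitype and single-type detection.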
Pages: 13