Interpretable segmentation of medical free-text records based on word embeddings

Cited by: 6
Authors
Dobrakowski, Adam Gabriel [1]
Mykowiecka, Agnieszka [2]
Marciniak, Malgorzata [2]
Jaworski, Wojciech [1]
Biecek, Przemyslaw [1, 3]
Affiliations
[1] Univ Warsaw, Banacha 2, Warsaw, Poland
[2] Polish Acad Sci, Inst Comp Sci, Jana Kazimierza 5, Warsaw, Poland
[3] Warsaw Univ Technol, Koszykowa 75, Warsaw, Poland
Keywords
Electronic health records; Natural language processing; Text clustering; Word embeddings
DOI
10.1007/s10844-021-00659-4
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Medical free-text records contain much useful information that can be exploited in developing computer-supported medicine. However, extracting knowledge from unstructured text is difficult and depends on the language. In this paper, we apply Natural Language Processing methods to raw medical texts in Polish and propose a new methodology for clustering patients' visits. We (1) extract medical terminology from a corpus of free-text clinical records, (2) annotate the data with medical concepts, (3) compute vector representations of medical concepts and validate them on the proposed term analogy tasks, (4) compute visit representations as vectors, (5) introduce a new method for clustering patients' visits, and (6) apply the method to a corpus of 100,000 visits. We use several approaches to visual exploration that facilitate the interpretation of segments. With our method, we obtain stable and well-separated segments of visits, which are positively validated against final medical diagnoses. We show how an algorithm for segmenting medical free-text records may be used to aid medical doctors. In addition, we share an implementation of the described methods, with examples, as the open-source R package memr.
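Steps (4) and (5) can be illustrated with a minimal Python sketch. It is not the authors' method and does not use the memr package: the abstract does not specify how visit vectors or segments are computed, so the sketch assumes a visit vector is the mean of its concept embeddings and swaps in a plain nearest-centroid assignment by cosine similarity. The embedding values, concept names, and the functions `visit_vector`, `cosine`, and `assign_segments` are all hypothetical.

```python
import math

# Hypothetical toy embeddings for medical concepts (made-up values,
# not taken from the paper or from the memr package).
concept_vectors = {
    "cough": [0.9, 0.1, 0.0],
    "fever": [0.8, 0.2, 0.1],
    "rash": [0.1, 0.9, 0.0],
    "itching": [0.0, 0.8, 0.2],
}


def visit_vector(concepts, embeddings):
    """Assumed representation: mean of the embeddings of a visit's concepts."""
    known = [embeddings[c] for c in concepts if c in embeddings]
    if not known:
        raise ValueError("visit has no concept with a known embedding")
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def assign_segments(visit_vecs, centroids):
    """Assign each visit to the index of its most similar centroid."""
    return [
        max(range(len(centroids)), key=lambda k: cosine(vec, centroids[k]))
        for vec in visit_vecs
    ]


# Each visit is the list of concepts annotated in its free-text record.
visits = [["cough", "fever"], ["rash", "itching"], ["fever"], ["itching"]]
vectors = [visit_vector(v, concept_vectors) for v in visits]

# Use the first two visits as fixed centroids purely for illustration.
centroids = [vectors[0], vectors[1]]
segments = assign_segments(vectors, centroids)
print(segments)
```

In this toy setup the respiratory-like visits and the skin-like visits fall into different segments. A real pipeline would learn the embeddings from the corpus and estimate the centroids with a proper clustering algorithm.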
Pages: 447-465
Number of pages: 19