Extraction, Labeling, Clustering, and Semantic Mapping of Segments From Clinical Notes

Cited by: 3
Authors
Zelina, Petr [1]
Halamkova, Jana [2, 3]
Novacek, Vit [1, 2, 4]
Affiliations
[1] Masaryk Univ, Fac Informat, Brno 60177, Czech Republic
[2] Masaryk Mem Canc Inst, Dept Comprehens Canc Care, Brno 65653, Czech Republic
[3] Masaryk Univ, Fac Med, Brno 60177, Czech Republic
[4] NUI Galway, Data Sci Inst, Galway H91 TK33, Ireland
Keywords
Task analysis; Semantics; Feature extraction; Ontologies; Nanobioscience; Measurement; Clinical diagnosis; Text categorization; Information retrieval; NLP; EHR; clinical notes; information extraction; text classification;
DOI
10.1109/TNB.2023.3275195
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
This work is motivated by the scarcity of tools for accurate, unsupervised information extraction from unstructured clinical notes in computationally underrepresented languages, such as Czech. We introduce a stepping stone to a broad array of downstream tasks such as summarisation or integration of individual patient records, extraction of structured information for national cancer registry reporting or building of semi-structured semantic patient representations that can be used for computing patient embeddings. More specifically, we present a method for unsupervised extraction of semantically-labeled textual segments from clinical notes and test it out on a dataset of Czech breast cancer patients, provided by Masaryk Memorial Cancer Institute (the largest Czech hospital specialising exclusively in oncology). Our goal was to extract, classify (i.e. label) and cluster segments of the free-text notes that correspond to specific clinical features (e.g., family background, comorbidities or toxicities). Finally, we propose a tool for computer-assisted semantic mapping of segment types to pre-defined ontologies and validate it on a downstream task of category-specific patient similarity. The presented results demonstrate the practical relevance of the proposed approach for building more sophisticated extraction and analytical pipelines deployed on Czech clinical notes.
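As a rough illustration of the downstream task mentioned in the abstract (category-specific patient similarity), the following is a minimal sketch only, not the authors' method. It assumes each patient is already represented as a mapping from a segment label to the text segments extracted for that label, and it swaps in plain Jaccard token overlap as the similarity measure. The function names, the labels `family_history` and `comorbidities`, and the English toy segments are all hypothetical.

```python
import re


def tokens(segments):
    """Return the set of lowercased word tokens pooled over a list of segments."""
    return {t for s in segments for t in re.findall(r"\w+", s.lower())}


def category_similarity(patient_a, patient_b, category):
    """Jaccard overlap of tokens within one segment category.

    Assumption: a patient is a dict {segment_label: [segment_text, ...]}.
    Returns 0.0 when neither patient has any tokens in the category.
    """
    a = tokens(patient_a.get(category, []))
    b = tokens(patient_b.get(category, []))
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical, hand-written stand-ins for labeled segments (not real data).
p1 = {
    "family_history": ["mother breast cancer"],
    "comorbidities": ["hypertension"],
}
p2 = {
    "family_history": ["mother breast cancer at 52"],
    "comorbidities": ["diabetes"],
}

fam_sim = category_similarity(p1, p2, "family_history")
com_sim = category_similarity(p1, p2, "comorbidities")
```

Comparing patients one category at a time keeps, for example, a shared family history from being diluted by unrelated comorbidities. A real pipeline would likely replace token overlap with embeddings or ontology-mapped concepts.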
Pages: 781-788
Page count: 8