Extraction, Labeling, Clustering, and Semantic Mapping of Segments From Clinical Notes

Cited by: 3
Authors
Zelina, Petr [1]
Halamkova, Jana [2, 3]
Novacek, Vit [1, 2, 4]
Affiliations
[1] Masaryk Univ, Fac Informat, Brno 60177, Czech Republic
[2] Masaryk Mem Canc Inst, Dept Comprehens Canc Care, Brno 65653, Czech Republic
[3] Masaryk Univ, Fac Med, Brno 60177, Czech Republic
[4] NUI Galway, Data Sci Inst, Galway H91 TK33, Ireland
Keywords
Task analysis; Semantics; Feature extraction; Ontologies; Nanobioscience; Measurement; Clinical diagnosis; Text categorization; Information retrieval; NLP; EHR; clinical notes; information extraction; text classification
DOI
10.1109/TNB.2023.3275195
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
This work is motivated by the scarcity of tools for accurate, unsupervised information extraction from unstructured clinical notes in computationally underrepresented languages, such as Czech. We introduce a stepping stone to a broad array of downstream tasks such as summarisation or integration of individual patient records, extraction of structured information for national cancer registry reporting or building of semi-structured semantic patient representations that can be used for computing patient embeddings. More specifically, we present a method for unsupervised extraction of semantically-labeled textual segments from clinical notes and test it out on a dataset of Czech breast cancer patients, provided by Masaryk Memorial Cancer Institute (the largest Czech hospital specialising exclusively in oncology). Our goal was to extract, classify (i.e. label) and cluster segments of the free-text notes that correspond to specific clinical features (e.g., family background, comorbidities or toxicities). Finally, we propose a tool for computer-assisted semantic mapping of segment types to pre-defined ontologies and validate it on a downstream task of category-specific patient similarity. The presented results demonstrate the practical relevance of the proposed approach for building more sophisticated extraction and analytical pipelines deployed on Czech clinical notes.
Pages: 781-788
Page count: 8