Extraction, Labeling, Clustering, and Semantic Mapping of Segments From Clinical Notes

Cited by: 3
Authors
Zelina, Petr [1]
Halamkova, Jana [2, 3]
Novacek, Vit [1, 2, 4]
Affiliations
[1] Masaryk Univ, Fac Informat, Brno 60177, Czech Republic
[2] Masaryk Mem Canc Inst, Dept Comprehens Canc Care, Brno 65653, Czech Republic
[3] Masaryk Univ, Fac Med, Brno 60177, Czech Republic
[4] NUI Galway, Data Sci Inst, Galway H91 TK33, Ireland
Keywords
Task analysis; Semantics; Feature extraction; Ontologies; Nanobioscience; Measurement; Clinical diagnosis; Text categorization; Information retrieval; NLP; EHR; clinical notes; information extraction; text classification
DOI
10.1109/TNB.2023.3275195
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
This work is motivated by the scarcity of tools for accurate, unsupervised information extraction from unstructured clinical notes in computationally underrepresented languages, such as Czech. We introduce a stepping stone to a broad array of downstream tasks, such as summarisation or integration of individual patient records, extraction of structured information for national cancer registry reporting, or building semi-structured semantic patient representations that can be used for computing patient embeddings. More specifically, we present a method for unsupervised extraction of semantically labeled textual segments from clinical notes and evaluate it on a dataset of Czech breast cancer patients provided by Masaryk Memorial Cancer Institute (the largest Czech hospital specialising exclusively in oncology). Our goal was to extract, classify (i.e., label), and cluster segments of the free-text notes that correspond to specific clinical features (e.g., family background, comorbidities, or toxicities). Finally, we propose a tool for computer-assisted semantic mapping of segment types to predefined ontologies and validate it on a downstream task of category-specific patient similarity. The presented results demonstrate the practical relevance of the proposed approach for building more sophisticated extraction and analytical pipelines deployed on Czech clinical notes.
Pages: 781-788
Number of pages: 8