A general text mining method to extract echocardiography measurement results from echocardiography documents

Cited by: 4
Authors
Szeker, Szabolcs [1]
Fogarassy, Gyorgy [2]
Vathy-Fogarassy, Agnes [1]
Affiliations
[1] Univ Pannonia, Dept Comp Sci & Syst Technol, Veszprem, Hungary
[2] State Hosp Cardiol, Dept Cardiol 1, Balatonfured, Hungary
Keywords
Information extraction; Clinical text mining; Echocardiography report; Named entity recognition; Natural language processing; INFORMATION EXTRACTION; ENTITY RECOGNITION; EJECTION FRACTION; DISTANCE METRICS; SEARCH;
DOI
10.1016/j.artmed.2023.102584
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Background: In everyday medical practice, the results of cardiac ultrasound examinations are generally recorded in unstructured text, from which extracting relevant information is an important and challenging task. This paper presents a generally applicable language and corpus-independent text mining method for extracting and structuring numerical measurement results and their descriptions from echocardiography reports.
Method: The developed method is based on generally applicable text mining preprocessing activities, it automatically identifies and standardizes the descriptions of the cardiac ultrasound measures, and it stores the extracted and standardized measurement descriptions with their measurement results in a structured form for later usage. The method does not contain any regular expression-based search and does not rely on information about the structure of the document.
Results: The method has been tested on a document set containing more than 20,000 echocardiographic reports by examining the efficiency of extracting 12 echocardiography parameters considered important by experts. The method extracted and structured the echocardiography parameters under the study with good sensitivity (lowest value: 0.775, highest value: 1.0, average: 0.904) and excellent specificity (for all cases 1.0). The F1 score ranged between 0.873 and 1.0, and its average value was 0.948.
Conclusion: The presented case study has shown that the proposed method can extract measurement results from echocardiography documents with high confidence without performing a direct search or having detailed information about the data recording habits. Furthermore, it effectively handles spelling errors, abbreviations and the highly varied terminology used in descriptions. As it does not rely on any information related to the structure or the language of the documents or data recording habits, it can be applied for processing any free-text written medical texts.
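The reported sensitivity and F1 ranges can be related with a short calculation. The sketch below is not taken from the paper: it assumes that a specificity of 1.0 means no false positives were produced, so precision is 1.0 for every parameter, and it uses the standard definition of F1 as the harmonic mean of precision and sensitivity. The function name `f1_from_sensitivity` is a hypothetical helper introduced for illustration.

```python
def f1_from_sensitivity(sensitivity: float, precision: float = 1.0) -> float:
    """Return F1 as the harmonic mean of precision and sensitivity (recall).

    The default precision of 1.0 is an assumption: it corresponds to
    zero false positives, which a specificity of 1.0 would imply.
    """
    if precision + sensitivity == 0:
        return 0.0
    return 2 * precision * sensitivity / (precision + sensitivity)


# Sensitivity extremes quoted in the abstract.
for label, sensitivity in [("lowest", 0.775), ("highest", 1.0)]:
    print(label, round(f1_from_sensitivity(sensitivity), 3))
```

Under that assumption the lowest sensitivity (0.775) gives F1 = 0.873 and the highest (1.0) gives F1 = 1.0, which matches the F1 range stated in the abstract. The average F1 (0.948) is an average over per-parameter scores, so it need not equal the F1 computed from the average sensitivity.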
Pages: 8