Deep Learning Analysis of Polish Electronic Health Records for Diagnosis Prediction in Patients with Cardiovascular Diseases

Cited by: 6
Authors
Anetta, Kristof [1]
Horak, Ales [1]
Wojakowski, Wojciech [2]
Wita, Krystian [3]
Jadczyk, Tomasz [2,4]
Affiliations
[1] Masaryk Univ, Fac Informat, Nat Language Proc Ctr, Brno 60200, Czech Republic
[2] Med Univ Silesia, Sch Med Katowice, Dept Cardiol & Struct Heart Dis, PL-40055 Katowice, Poland
[3] Med Univ Silesia, Dept Cardiol 1, PL-40055 Katowice, Poland
[4] St Annes Univ Hosp Brno, Intervent Cardiac Electrophysiol Grp, Int Clin Res Ctr, Brno 65691, Czech Republic
Keywords
electronic health records; deep learning; text analysis; diagnosis prediction; Polish language; HEART-FAILURE; TEMPORAL TRENDS; RISK
DOI
10.3390/jpm12060869
Chinese Library Classification (CLC) number
R19 [Health care organizations and services (health services administration)]
Abstract
Electronic health records naturally contain most of the medical information in the form of doctor's notes as unstructured or semi-structured texts. Current deep learning text analysis approaches allow researchers to reveal the inner semantics of text information and even identify hidden consequences that can offer extra decision support to doctors. In the presented article, we offer a new automated analysis of Polish summary texts of patient hospitalizations. The presented models were found to be able to predict the final diagnosis with almost 70% accuracy based just on the patient's medical history (only 132 words on average), with possible accuracy increases when adding further sentences from hospitalization results; even one sentence was found to improve the results by 4%, and the best accuracy of 78% was achieved with five extra sentences. In addition to detailed descriptions of the data and methodology, we present an evaluation of the analysis using more than 50,000 Polish cardiology patient texts and dive into a detailed error analysis of the approach. The results indicate that the deep analysis of just the medical history summary can suggest the direction of diagnosis with a high probability that can be further increased just by supplementing the records with further examination results.
Pages: 17