Classification of Severe Maternal Morbidity from Electronic Health Records Written in Spanish Using Natural Language Processing

Cited by: 4
Authors
Torres-Silva, Ever A. [1 ]
Rua, Santiago [2 ]
Giraldo-Forero, Andres F. [1 ]
Durango, Maria C. [3 ]
Florez-Arango, Jose F. [4 ]
Orozco-Duque, Andres [3 ]
Affiliations
[1] Inst Tecnol Metropolitano, Fac Engn, Medellin 050034, Colombia
[2] Univ Nacl Abierta & Distancia, Sch Basic Sci Technol & Engn, Bogota 111321, Colombia
[3] Inst Tecnol Metropolitano, Dept Appl Sci, Medellin 050034, Colombia
[4] Weill Cornell Med, Populat Hlth Sci, New York, NY 10065 USA
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 19
Keywords
electronic health records; machine learning; maternal health; pregnancy complications; natural language processing; word-embedding; MACHINE; EMBEDDINGS;
DOI
10.3390/app131910725
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
One stepping stone toward reducing maternal mortality is to identify severe maternal morbidity (SMM) using electronic health records (EHRs). We aim to develop a pipeline to represent and classify the unstructured text of maternal progress notes into eight classes according to the silver labels defined by the ICD-10 codes associated with SMM. We preprocessed the text, removing protected health information (PHI) and reducing stop words. We built different pipelines to classify SMM by combining six word-embedding schemes, three approaches to document representation (average, clustering, and principal component analysis), and five well-known machine learning classifiers. Additionally, we implemented an algorithm for adjusting typos and misspellings based on the Levenshtein distance to the Spanish Billion Word Corpus dictionary. We analyzed 43,529 documents, each built from an average of 4.15 progress notes, from 22,937 patients. The best-performing pipeline combined Word2Vec, typo and spelling adjustment, document representation by PCA, and an SVM classifier. We found that it is possible to identify conditions such as miscarriage complications or hypertensive disorders from clinical notes written in Spanish, with a true positive rate higher than 0.85. This is the first approach to classifying SMM from the unstructured text contained in maternal EHRs, and it can contribute to solving one of the most important public health problems in the world. Future work should test other representation and classification approaches to detect the risk of SMM.
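The abstract mentions a typo adjustment step based on the Levenshtein distance to a dictionary. The following is only a minimal sketch of that general idea, not the authors' implementation: the function names, the `max_distance` threshold, and the four-word toy vocabulary are all hypothetical, and a real pipeline would draw its vocabulary from a large corpus dictionary instead.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def correct_token(token: str, vocabulary: set, max_distance: int = 2) -> str:
    """Return the closest vocabulary word, or the token itself if none is close.

    max_distance is an assumed threshold; the source does not state one.
    """
    if token in vocabulary:
        return token
    # Ties are broken alphabetically so the result is deterministic.
    best = min(vocabulary, key=lambda word: (levenshtein(token, word), word))
    return best if levenshtein(token, best) <= max_distance else token


# Hypothetical toy vocabulary (unaccented Spanish clinical terms).
VOCABULARY = {"hipertension", "aborto", "embarazo", "paciente"}

corrected = [correct_token(t, VOCABULARY) for t in "paciente con hipertencion".split()]
```

A linear scan over the vocabulary is acceptable for a toy example; for a dictionary with hundreds of thousands of entries, an index such as a BK-tree or a length-based prefilter would likely be needed.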
Pages: 17