A Deep Learning Based Approach to Automate Clinical Coding of Electronic Health Records

Authors
Kumar, Ashutosh [1 ]
Rathore, Santosh Singh [1 ]
Affiliation
[1] ABV Indian Inst Informat Technol & Management, Dept Comp Sci & Engn, Gwalior, India
Source
BIG DATA ANALYTICS, BDA 2022 | 2022 / Vol. 13773
Keywords
Machine learning; Electronic health record; Clinical coding; Cosine similarity; Transformers;
DOI
10.1007/978-3-031-24094-2_7
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Medical records in different electronic formats, such as handwritten notes, diagnosis summaries, lab reports, and electronic PDFs, contain valuable information that can be used for various medical purposes. These health records are currently coded manually or semi-automatically to assign clinical codes (ICD codes) for clinical research and analytics, a process that is time-consuming, expensive, and error-prone. This paper presents a method for automated clinical coding of electronic health records (EHRs) given the patient diagnosis summary and other medical documents. The method uses natural language processing (NLP) techniques that capture knowledge from free-text diagnosis descriptions, perform text matching and semantic mapping, and translate diagnosis descriptions into clinical codes. We develop a baseline hybrid model combining Word2vec and cosine similarity, a transformer encoder model, and a BERT (Bidirectional Encoder Representations from Transformers) model for automated clinical coding. The models are evaluated on the publicly available Medical Information Mart for Intensive Care III (MIMIC-III) dataset, which consists of patient diagnosis descriptions and corresponding ICD-9 codes. The experimental results show that the BlueBERT-based automated clinical coding model achieved an AUC (area under the ROC curve) of 98.9% for top-10 ICD code prediction. On the full MIMIC-III dataset, the transformer model achieved an accuracy of 76.8%, a precision of 61.02%, a recall of 47.22%, an F1-score of 53.2%, and an AUC of 92.1%. The hybrid baseline model and the other transformer encoder model used also showed promising results.
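The text-matching idea behind the baseline can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes bag-of-words count vectors as a stand-in for Word2vec embeddings, and the names `ICD9_DESCRIPTIONS`, `embed`, and `rank_codes`, as well as the three-entry code table, are hypothetical and chosen only for illustration. It ranks candidate ICD-9 codes by the cosine similarity between a free-text diagnosis and each code description.

```python
import math
from collections import Counter

# Hypothetical, illustrative code table (not taken from the paper).
ICD9_DESCRIPTIONS = {
    "428.0": "congestive heart failure unspecified",
    "401.9": "unspecified essential hypertension",
    "584.9": "acute kidney failure unspecified",
}


def embed(text):
    """Bag-of-words count vector; an assumed stand-in for Word2vec embeddings."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text matches nothing
    return dot / (norm_a * norm_b)


def rank_codes(diagnosis, code_table, top_k=3):
    """Return up to top_k (code, score) pairs, most similar first."""
    query = embed(diagnosis)
    scored = [
        (code, cosine_similarity(query, embed(description)))
        for code, description in code_table.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


if __name__ == "__main__":
    for code, score in rank_codes(
        "patient with congestive heart failure", ICD9_DESCRIPTIONS
    ):
        print(code, round(score, 3))
```

With dense embeddings in place of word counts, the same ranking step would also match descriptions that share meaning but not vocabulary, which is presumably the motivation for pairing Word2vec with cosine similarity.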
Pages: 104-116
Page count: 13