Identification of asthma control factor in clinical notes using a hybrid deep learning model

Cited by: 17
Authors
Agnikula Kshatriya, Bhavani Singh [1]
Sagheb, Elham [1]
Wi, Chung-Il [2]
Yoon, Jungwon [3]
Seol, Hee Yun [4]
Juhn, Young [2]
Sohn, Sunghwan [1]
Affiliations
[1] Mayo Clin, Dept Artificial Intelligence & Informat, 200 First St SW, Rochester, MN 55905 USA
[2] Mayo Clin, Dept Pediat & Adolescent Med, Precis Populat Sci Lab, Rochester, MN USA
[3] Myongji Hosp, Dept Pediat, Goyang, South Korea
[4] Pusan Natl Univ, Yangsan Hosp, Yangsan, South Korea
Keywords
Deep learning; Context-aware language model; Natural language processing; Documentation variations; Adherence to asthma guidelines; Inhaler technique; GUIDELINES; CHILDREN; ASCERTAINMENT; ADHERENCE
DOI
10.1186/s12911-021-01633-4
Chinese Library Classification number
R-058
Abstract
Background: Guideline-concordant documentation in asthma care varies significantly. However, clinicians' documentation cannot be assessed from structured data alone; it requires labor-intensive chart review of electronic health records (EHRs). Certain guideline elements among asthma control factors, such as review of inhaler technique, require an understanding of context to be captured correctly from EHR free text.
Methods: The study data consist of two sets: (1) manually chart-reviewed data, comprising 1,039 clinical notes from 300 patients with an asthma diagnosis, and (2) weakly labeled data (distant supervision), comprising 27,363 clinical notes from 800 patients with an asthma diagnosis. A context-aware language model, Bidirectional Encoder Representations from Transformers (BERT), was developed to identify inhaler techniques in EHR free text. Both the original BERT and clinical BioBERT (cBERT) were applied with cost sensitivity to deal with imbalanced data. Distant supervision using rule-generated weak labels was also incorporated to augment the training set and reduce the costly manual labeling needed to develop a deep learning algorithm. A hybrid approach using post-hoc rules was also explored to fix BERT model errors. The BERT models with and without distant supervision, the hybrid models, and the rule-based model were compared on precision, recall, F-score, and accuracy.
Results: The BERT models trained on the original data performed similarly to the rule-based model in F1-score (0.837, 0.845, and 0.838 for rules, BERT, and cBERT, respectively). The BERT models with distant supervision achieved higher performance (0.853 and 0.880 for BERT and cBERT, respectively) than both the models without distant supervision and the rule-based model. The hybrid models performed best, with F1-scores of 0.877 and 0.904 when applied on top of the distantly supervised BERT and cBERT, respectively.
Conclusions: The proposed BERT models with distant supervision demonstrated their capability to identify inhaler techniques in EHR free text and outperformed both the rule-based model and the BERT models trained on the original data. A distant supervision approach may reduce the costly manual chart review otherwise needed to generate the large training sets that most deep learning models require. A hybrid model was able to fix BERT model errors and further improve performance.
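The rule-based weak labeling, post-hoc correction, and F1 evaluation mentioned in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the abstract does not give the study's actual rules, so the regular expressions, the function names (`weak_label`, `hybrid_label`, `precision_recall_f1`), and the override logic are hypothetical, and the BERT classifier itself is represented only by an integer prediction passed in.

```python
import re

# Hypothetical patterns; the study's real rules are not given in the abstract.
INHALER_TERMS = re.compile(r"\b(inhaler|spacer|mdi)\b", re.IGNORECASE)
REVIEW_TERMS = re.compile(
    r"\b(technique|demonstrat\w*|review\w*|instruct\w*)\b", re.IGNORECASE
)


def weak_label(sentence: str) -> int:
    """Weak label: 1 if a device term and a review cue both appear, else 0."""
    has_device = bool(INHALER_TERMS.search(sentence))
    has_review = bool(REVIEW_TERMS.search(sentence))
    return int(has_device and has_review)


def hybrid_label(model_pred: int, sentence: str) -> int:
    """Assumed post-hoc rule: drop a positive prediction with no device term."""
    if model_pred == 1 and not INHALER_TERMS.search(sentence):
        return 0
    return model_pred


def precision_recall_f1(gold, pred):
    """Precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

In a pipeline of this shape, `weak_label` would be run over the unlabeled notes to enlarge the training set, and `hybrid_label` would be applied to the classifier's output before scoring with `precision_recall_f1`.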
Pages: 9