Labeling Chest X-Ray Reports Using Deep Learning

Cited by: 1
Authors
Monshi, Maram Mahmoud A. [1, 2]
Poon, Josiah [1]
Chung, Vera [1]
Monshi, Fahad Mahmoud [3]
Affiliations
[1] Univ Sydney, Sch Comp Sci, Camperdown, NSW 2006, Australia
[2] Taif Univ, Dept Informat Technol, At Taif 26571, Saudi Arabia
[3] King Saud Univ Med City, Radiol & Med Imaging Dept, Riyadh 12746, Saudi Arabia
Source
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2021, PT III | 2021 / Vol. 12893
Keywords
Chest X-Ray report; Natural Language Processing; Recurrent neural network; CHEXPERT;
DOI
10.1007/978-3-030-86365-4_55
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
One of the primary challenges in developing Chest X-Ray (CXR) interpretation models has been the lack of large datasets with multilabel image annotations extracted from radiology reports. This paper proposes a CXR labeler, abbreviated as CXRlabeler, that simultaneously extracts fourteen observations from free-text radiology reports, marking each as positive or negative. It fine-tunes a pre-trained language model, AWD-LSTM, on a corpus of CXR radiology impressions and then uses it as the base of the multilabel classifier. Experiments demonstrate that fine-tuning the language model increases the classifier F1 score by 12.53%. Overall, CXRlabeler achieves a 96.17% F1 score on the MIMIC-CXR dataset. To further test its generalization, the model is also evaluated on the PadChest dataset. This evaluation shows that the CXRlabeler approach is helpful in a different language environment, and the model (available at https://github.com/MaramMonshi/CXRlabeler) can assist researchers in labeling CXR datasets with fourteen observations.
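The abstract reports F1 scores for a multilabel classifier but does not say how the score is averaged across the fourteen observations. As a minimal sketch, assuming micro-averaging (an assumption, not something the record confirms), the metric can be computed from binary label vectors as below. The function name `micro_f1` and the toy three-label data are hypothetical and only illustrate the arithmetic.

```python
from typing import Sequence


def micro_f1(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Micro-averaged F1 over all (report, label) pairs.

    Each row is one report; each column is one observation,
    encoded as 1 (positive) or 0 (negative).
    Micro-averaging is an assumption made for this sketch.
    """
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1  # label correctly predicted positive
            elif t == 0 and p == 1:
                fp += 1  # label predicted positive but actually negative
            elif t == 1 and p == 0:
                fn += 1  # positive label that was missed
    denominator = 2 * tp + fp + fn
    # Return 0.0 when there are no positives at all, to avoid dividing by zero.
    return 2 * tp / denominator if denominator else 0.0


# Toy data with three labels per report (the paper uses fourteen).
toy_true = [[1, 0, 1], [0, 1, 0]]
toy_pred = [[1, 0, 0], [0, 1, 1]]
toy_score = micro_f1(toy_true, toy_pred)
```

In the toy data there are two true positives, one false positive, and one false negative, so the score is 2·2 / (2·2 + 1 + 1) = 2/3.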
Pages: 684-694
Number of pages: 11