Improving Chest X-Ray Report Generation by Leveraging Warm Starting

Cited by: 24
Authors
Nicolson, Aaron [1]
Dowling, Jason [1]
Koopman, Bevan [1]
Affiliations
[1] CSIRO Hlth & Biosecur, Australian eHlth Res Ctr, Brisbane, Australia
Keywords
Chest X-ray report generation; Image captioning; Multi-modal learning; Warm starting; ARTIFICIAL-INTELLIGENCE; RADIOLOGY
DOI
10.1016/j.artmed.2023.102633
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Automatically generating a report from a patient's chest X-rays (CXRs) is a promising solution to reducing clinical workload and improving patient care. However, current CXR report generators, which are predominantly encoder-to-decoder models, lack the diagnostic accuracy to be deployed in a clinical setting. To improve CXR report generation, we investigate warm starting the encoder and decoder with recent open-source computer vision and natural language processing checkpoints, such as the Vision Transformer (ViT) and PubMedBERT. To this end, each checkpoint is evaluated on the MIMIC-CXR and IU X-Ray datasets. Our experimental investigation demonstrates that the Convolutional vision Transformer (CvT) ImageNet-21K and the Distilled Generative Pre-trained Transformer 2 (DistilGPT2) checkpoints are best for warm starting the encoder and decoder, respectively. Compared to the state-of-the-art (M2 Transformer Progressive), CvT2DistilGPT2 attained an improvement of 8.3% for CE F-1, 1.8% for BLEU-4, 1.6% for ROUGE-L, and 1.0% for METEOR. The reports generated by CvT2DistilGPT2 have a higher similarity to radiologist reports than previous approaches. This indicates that leveraging warm starting improves CXR report generation. Code and checkpoints for CvT2DistilGPT2 are available at https://github.com/achre/cvt2distiglgpt2.
Pages: 17
Related papers
85 in total
  • [1] An Intelligent Future for Medical Imaging: A Market Outlook on Artificial Intelligence for Medical Imaging
    Alexander, Alan
    Jiang, Adam
    Ferreira, Cara
    Zurkiya, Delphine
    [J]. JOURNAL OF THE AMERICAN COLLEGE OF RADIOLOGY, 2020, 17 (01) : 165 - 170
  • [2] Alfarghaly Omar, 2021, Informatics in Medicine Unlocked, V24, DOI 10.1016/j.imu.2021.100557
  • [3] Alsentzer E., 2019, P CLIN NATURAL LANGU, DOI 10.18653/V1/W19-1909
  • [4] Automatic medical image interpretation: State of the art and future directions
    Ayesha, Hareem
    Iqbal, Sajid
    Tariq, Mehreen
    Abrar, Muhammad
    Sanaullah, Muhammad
    Abbas, Ishaq
    Rehman, Amjad
    Niazi, Muhammad Farooq Khan
    Hussain, Shafiq
    [J]. PATTERN RECOGNITION, 2021, 114
  • [5] Evaluating diagnostic content of AI-generated radiology reports of chest X-rays
    Babar, Zaheer
    van Laarhoven, Twan
    Zanzotto, Fabio Massimo
    Marchiori, Elena
    [J]. ARTIFICIAL INTELLIGENCE IN MEDICINE, 2021, 116 (116)
  • [6] Variability in interpretation of chest radiographs among Russian clinicians and implications for screening programmes: observational study
    Balabanova, Y
    Coker, R
    Fedorin, I
    Zakharova, S
    Plavinskij, S
    Krukov, N
    Atun, R
    Drobniewski, F
    [J]. BMJ-BRITISH MEDICAL JOURNAL, 2005, 331 (7513) : 379 - +
  • [7] Banerjee S, 2005, P ACL WORKSHOP INTRI, P65, DOI 10.3115/1626355.1626389
  • [8] Bao H., 2021, arXiv
  • [9] Beltagy I, 2019, 2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019), P3615
  • [10] Chen XL, 2015, Arxiv, DOI arXiv:1504.00325