A Multi-modal Deep Learning Method for Classifying Chest Radiology Exams

Cited by: 7
Authors:
Nunes, Nelson [1]
Martins, Bruno [1]
da Silva, Nuno Andre [2]
Leite, Francisca [2]
Silva, Mario J. [1]
Affiliations:
[1] Univ Lisbon, Inst Super Tecn, INESC ID, Lisbon, Portugal
[2] Luz Saude, Hosp Luz Learning Hlth, Lisbon, Portugal
Source:
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2019, PT I | 2019, Vol. 11804
Keywords:
Classification of radiology exams; Machine learning in medicine; Learning from multi-modal data; Deep learning
DOI:
10.1007/978-3-030-30241-2_28
Chinese Library Classification:
TP18 [Theory of artificial intelligence]
Discipline codes:
081104; 0812; 0835; 1405
Abstract:
Non-invasive medical imaging techniques, such as radiography or computed tomography, are extensively used in hospitals and clinics for the diagnosis of diverse injuries and diseases. However, interpreting these images, which often results in a free-text radiology report and/or a classification, requires specialized medical professionals, leading to high labor costs and waiting lists. Automatic inference of thoracic diseases from the results of chest radiography exams, e.g. for the purpose of indexing these documents, remains a challenging task, even when combining images with the free-text reports. Deep neural architectures can contribute to more efficient indexing of radiology exams (e.g., associating the data to diagnostic codes), providing interpretable classification results that can guide the domain experts. This work proposes a novel multi-modal approach, combining a dual-path convolutional neural network for processing images with a bidirectional recurrent neural network for processing text, enhanced with attention mechanisms and leveraging pre-trained clinical word embeddings. The experimental results show interesting patterns, e.g. validating the high performance of the individual components, and showing promising results for the multi-modal processing of radiology examination data, particularly when pre-training the components of the model with large pre-existing datasets (i.e., a 10% increase in the average area under the receiver operating characteristic curve).
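The fusion scheme described in the abstract can be sketched, in a minimal and purely illustrative form, with NumPy: attention-pooled token states (stand-ins for the bidirectional RNN's outputs over the report) are concatenated with a pooled image feature vector (a stand-in for the dual-path CNN), and a sigmoid output layer yields independent per-label probabilities, as is usual for multi-label chest-finding classification. All dimensions, weights, and the 14-label output size are arbitrary placeholders for illustration, not the authors' actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(H, w):
    """Score each token state against a learned vector, then
    return the attention-weighted sum of the states."""
    scores = softmax(H @ w)  # (T,) weights summing to 1
    return scores @ H        # (d,) pooled text representation

# Hypothetical inputs: 6 report tokens with 8-dim contextual states
# (as a BiLSTM would produce) and a 10-dim pooled image vector.
H_text = rng.normal(size=(6, 8))
v_img = rng.normal(size=10)

w_attn = rng.normal(size=8)
v_text = attention_pool(H_text, w_attn)

# Late fusion: concatenate the two modalities, then a sigmoid layer
# gives one independent probability per thoracic finding.
W_out = rng.normal(size=(14, 8 + 10))
b_out = np.zeros(14)
logits = W_out @ np.concatenate([v_text, v_img]) + b_out
probs = 1.0 / (1.0 + np.exp(-logits))
```

In a trained model the attention weights over tokens (and, analogously, over image regions) are what makes the classification results interpretable for domain experts: they indicate which parts of the report or image drove each predicted label.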
Pages: 323-335 (13 pages)