A Multi-modal Deep Learning Method for Classifying Chest Radiology Exams

Cited by: 7
Authors:
Nunes, Nelson [1]
Martins, Bruno [1]
da Silva, Nuno Andre [2]
Leite, Francisca [2]
Silva, Mario J. [1]
Affiliations:
[1] Univ Lisbon, Inst Super Tecn, INESC ID, Lisbon, Portugal
[2] Luz Saude, Hosp Luz Learning Hlth, Lisbon, Portugal
Source:
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2019, PT I | 2019, Vol. 11804
Keywords:
Classification of radiology exams; Machine learning in medicine; Learning from multi-modal data; Deep learning
DOI:
10.1007/978-3-030-30241-2_28
Chinese Library Classification:
TP18 [Theory of artificial intelligence]
Discipline codes:
081104; 0812; 0835; 1405
Abstract:
Non-invasive medical imaging techniques, such as radiography or computed tomography, are extensively used in hospitals and clinics for the diagnosis of diverse injuries and diseases. However, interpreting these images, which often results in a free-text radiology report and/or a classification, requires specialized medical professionals, leading to high labor costs and waiting lists. Automatic inference of thoracic diseases from the results of chest radiography exams, e.g. for the purpose of indexing these documents, remains a challenging task, even when combining images with the free-text reports. Deep neural architectures can contribute to more efficient indexing of radiology exams (e.g., associating the data to diagnostic codes), providing interpretable classification results that can guide the domain experts. This work proposes a novel multi-modal approach, combining a dual-path convolutional neural network for processing images with a bidirectional recurrent neural network for processing text, enhanced with attention mechanisms and leveraging pre-trained clinical word embeddings. The experimental results show interesting patterns, e.g. validating the high performance of the individual components, and showing promising results for the multi-modal processing of radiology examination data, particularly when pre-training the components of the model with large pre-existing datasets (i.e., a 10% increase in the average area under the receiver operating characteristic curve).
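The fusion scheme described in the abstract can be sketched, in a minimal and purely illustrative form, with NumPy: attention-pooled token states (stand-ins for the bidirectional RNN's outputs over the report) are concatenated with a pooled image feature vector (a stand-in for the dual-path CNN), and a sigmoid output layer yields independent per-label probabilities, as is usual for multi-label chest-finding classification. All dimensions, weights, and the 14-label output size are arbitrary placeholders for illustration, not the authors' actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(H, w):
    """Score each token state against a learned vector, then
    return the attention-weighted sum of the states."""
    scores = softmax(H @ w)  # (T,) weights summing to 1
    return scores @ H        # (d,) pooled text representation

# Hypothetical inputs: 6 report tokens with 8-dim contextual states
# (as a BiLSTM would produce) and a 10-dim pooled image vector.
H_text = rng.normal(size=(6, 8))
v_img = rng.normal(size=10)

w_attn = rng.normal(size=8)
v_text = attention_pool(H_text, w_attn)

# Late fusion: concatenate the two modalities, then a sigmoid layer
# gives one independent probability per thoracic finding.
W_out = rng.normal(size=(14, 8 + 10))
b_out = np.zeros(14)
logits = W_out @ np.concatenate([v_text, v_img]) + b_out
probs = 1.0 / (1.0 + np.exp(-logits))
```

In a trained model the attention weights over tokens (and, analogously, over image regions) are what makes the classification results interpretable for domain experts: they indicate which parts of the report or image drove each predicted label.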
Pages: 323-335 (13 pages)