Multi-Modal Retinal Image Classification With Modality-Specific Attention Network

Cited by: 66
Authors
He, Xingxin [1]
Deng, Ying [2]
Fang, Leyuan [1]
Peng, Qinghua [2]
Affiliations
[1] Hunan Univ, Coll Elect & Informat Engn, Changsha 410082, Hunan, Peoples R China
[2] Hunan Univ Chinese Med, Dept Ophthalmol, Hosp 1, Changsha 410082, Hunan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Retina; Deep learning; Feature extraction; Biomedical imaging; Optical imaging; Image segmentation; Training; Fundus photography; optical coherence tomography; classification; multi-modal; attention; convolutional neural network; COHERENCE TOMOGRAPHY IMAGES; DIABETIC-RETINOPATHY; LEARNING ALGORITHM; GLAUCOMA; DISEASES; EDEMA;
DOI
10.1109/TMI.2021.3059956
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Recently, automatic diagnostic approaches have been widely used to classify ocular diseases. Most of these approaches rely on a single imaging modality (e.g., fundus photography or optical coherence tomography (OCT)), which usually reflects the oculopathy only to a certain extent, and they neglect the modality-specific information that different imaging modalities provide. This paper proposes a novel modality-specific attention network (MSAN) for multi-modal retinal image classification, which can effectively utilize the modality-specific diagnostic features from fundus and OCT images. The MSAN comprises two attention modules that extract the modality-specific features from fundus and OCT images, respectively. Specifically, for the fundus image, ophthalmologists need to observe local and global pathologies at multiple scales (e.g., from microaneurysms at the micrometer level and the optic disc at the millimeter level to blood vessels spanning the whole eye). Therefore, we propose a multi-scale attention module to extract both the local and global features from fundus images. Moreover, OCT images contain large background regions that are meaningless for diagnosis. Thus, a region-guided attention module is proposed to encode the retinal layer-related features and ignore the background in OCT images. Finally, we fuse the modality-specific features to form a multi-modal feature and train the multi-modal retinal image classification network. The fusion of modality-specific features allows the model to combine the advantages of the fundus and OCT modalities for a more accurate diagnosis. Experimental results on a clinically acquired multi-modal retinal image (fundus and OCT) dataset demonstrate that our MSAN outperforms other well-known single-modal and multi-modal retinal image classification methods.
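The three ideas named in the abstract (multi-scale features for fundus images, background suppression for OCT images, and fusion of the two) can be illustrated with a toy sketch. This is not the MSAN architecture: the abstract does not specify layers, attention formulas, or the fusion operator, so everything here is an assumption made for illustration. The function names (`average_pool`, `multi_scale_features`, `region_guided_feature`, `fuse`) are hypothetical, plain pooling stands in for the learned attention modules, the retinal-layer mask is hand-written rather than predicted, and fusion is assumed to be simple concatenation.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def average_pool(fmap: Matrix, size: int) -> Matrix:
    """Non-overlapping size x size average pooling (dimensions must divide evenly)."""
    rows, cols = len(fmap), len(fmap[0])
    return [
        [
            sum(fmap[r + i][c + j] for i in range(size) for j in range(size))
            / (size * size)
            for c in range(0, cols, size)
        ]
        for r in range(0, rows, size)
    ]


def multi_scale_features(fmap: Matrix, scales: Sequence[int] = (1, 2)) -> List[float]:
    """Peak response per scale: small scales keep local detail, large ones keep context.

    Assumption: a stand-in for a learned multi-scale attention module.
    """
    return [max(max(row) for row in average_pool(fmap, s)) for s in scales]


def region_guided_feature(fmap: Matrix, mask: Matrix) -> float:
    """Mask-weighted mean, so background positions (mask 0) do not contribute.

    Assumption: the mask is given; a real model would have to estimate it.
    """
    weight = sum(m for row in mask for m in row)
    if weight == 0:
        return 0.0
    total = sum(f * m for frow, mrow in zip(fmap, mask) for f, m in zip(frow, mrow))
    return total / weight


def fuse(fundus_features: Sequence[float], oct_features: Sequence[float]) -> List[float]:
    """Assumed fusion: concatenate the two modality-specific feature vectors."""
    return list(fundus_features) + list(oct_features)


# Toy 2x2 "feature maps"; the OCT mask marks only the bottom row as retina.
fundus_map = [[0.0, 4.0], [0.0, 0.0]]
oct_map = [[9.0, 9.0], [2.0, 4.0]]
retina_mask = [[0.0, 0.0], [1.0, 1.0]]

fused = fuse(
    multi_scale_features(fundus_map),
    [region_guided_feature(oct_map, retina_mask)],
)
print(fused)  # [4.0, 1.0, 3.0]
```

In this toy input the large background values in the top row of `oct_map` are excluded by the mask, while the fundus descriptor keeps both the sharp local peak (scale 1) and the coarser average (scale 2). A classifier would then be trained on the fused vector.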
Pages: 1591-1602
Page count: 12