Medical Image Segmentation via Cascaded Attention Decoding

被引:168
作者
Rahman, Md Mostafijur [1 ]
Marculescu, Radu [1 ]
机构
[1] Univ Texas Austin, Austin, TX 78712 USA
来源
2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV) | 2023年
关键词
NETWORK;
D O I
10.1109/WACV56688.2023.00616
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Transformers have shown great promise in medical image segmentation due to their ability to capture long-range dependencies through self-attention. However, they lack the ability to learn the local (contextual) relations among pixels. Previous works try to overcome this problem by embedding convolutional layers either in the encoder or decoder modules of transformers thus ending up sometimes with inconsistent features. To address this issue, we propose a novel attention-based decoder, namely CASCaded Attention DEcoder (CASCADE), which leverages the multi-scale features of hierarchical vision transformers. CASCADE consists of i) an attention gate which fuses features with skip connections and ii) a convolutional attention module that enhances the long-range and local context by suppressing background information. We use a multi-stage feature and loss aggregation framework due to their faster convergence and better performance. Our experiments demonstrate that transformers with CASCADE significantly outperform state-of-the-art CNN- and transformer-based approaches, obtaining up to 5.07% and 6.16% improvements in DICE and mIoU scores, respectively. CASCADE opens new ways of designing better attention-based decoders.
引用
收藏
页码:6211 / 6220
页数:10
相关论文
共 37 条
[1]  
[Anonymous], 2021, P 29 ACM INT C MULT, DOI DOI 10.1109/SNPDWINTER52325.2021.00044
[2]   WM-DOVA maps for accurate polyp highlighting in colonoscopy: Validation vs. saliency maps from physicians [J].
Bernal, Jorge ;
Javier Sanchez, F. ;
Fernandez-Esparrach, Gloria ;
Gil, Debora ;
Rodriguez, Cristina ;
Vilarino, Fernando .
COMPUTERIZED MEDICAL IMAGING AND GRAPHICS, 2015, 43 :99-111
[3]  
Cao H., 2021, arXiv:2105.05537
[4]  
Chen J., 2021, arXiv:2102.04306
[5]   DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs [J].
Chen, Liang-Chieh ;
Papandreou, George ;
Kokkinos, Iasonas ;
Murphy, Kevin ;
Yuille, Alan L. .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (04) :834-848
[6]   SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning [J].
Chen, Long ;
Zhang, Hanwang ;
Xiao, Jun ;
Nie, Liqiang ;
Shao, Jian ;
Liu, Wei ;
Chua, Tat-Seng .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :6298-6306
[7]   Reverse Attention for Salient Object Detection [J].
Chen, Shuhan ;
Tan, Xiuli ;
Wang, Ben ;
Hu, Xuelong .
COMPUTER VISION - ECCV 2018, PT IX, 2018, 11213 :236-252
[8]  
Chu Xiangxiang, 2021, ARXIV210210882
[9]  
Deng-Ping Fan, 2020, Medical Image Computing and Computer Assisted Intervention - MICCAI 2020. 23rd International Conference. Proceedings. Lecture Notes in Computer Science (LNCS 12266), P263, DOI 10.1007/978-3-030-59725-2_26
[10]  
Dong B., 2021, arXiv