MEDA: Multi-output Encoder-Decoder for Spatial Attention in Convolutional Neural Networks

Citations: 0
Authors
Li, Huayu [1]
Razi, Abolfazl [1]
Affiliations
[1] Northern Arizona Univ, Sch Informat Comp & Cyber Syst, Flagstaff, AZ 86011 USA
Source
CONFERENCE RECORD OF THE 2019 FIFTY-THIRD ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS | 2019
Keywords
Attention Mechanism; Deep Learning; Encoder-Decoder Architecture; Convolutional Networks
DOI
10.1109/ieeeconf44664.2019.9048981
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Using channel-wise and spatial attention mechanisms to emphasize specific parts of an input image is an effective way to improve the performance of convolutional neural networks (CNNs). There are multiple effective implementations of the attention mechanism. One is to add squeeze-and-excitation (SE) blocks to the CNN structure; these selectively emphasize the most informative channels and suppress the less informative ones by taking advantage of channel dependence. Another is to add a convolutional block attention module (CBAM), which implements both channel-wise and spatial attention to select important pixels of the feature maps while emphasizing informative channels. In this paper, we propose an encoder-decoder architecture based on the idea of letting the channel-wise and spatial attention blocks share the same latent space representation. Instead of separating the channel-wise and spatial attention modules into two independent parts as in CBAM, we combine them into one encoder-decoder architecture with two outputs. To evaluate the performance of the proposed algorithm, we apply it to different CNN architectures and test it on image classification and semantic segmentation. By comparing the resulting structures equipped with MEDA blocks against other attention modules, we show that the proposed method achieves better performance across different test scenarios.
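
The abstract does not specify the layers inside a MEDA block, so the following is only a minimal NumPy sketch of the general idea it describes: one encoder produces a shared latent representation, and two decoder heads read that latent to produce a channel-wise gate and a spatial gate. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function name `two_output_attention`, the weight names `w_enc`, `w_chan`, `w_spat`, the use of per-pixel linear maps (equivalent to 1x1 convolutions), the ReLU and sigmoid activations, and the latent width.

```python
import numpy as np


def sigmoid(z):
    """Elementwise logistic function, mapping reals into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def two_output_attention(x, w_enc, w_chan, w_spat):
    """Hypothetical shared-latent attention block (not the authors' exact design).

    x      : feature map of shape (H, W, C)
    w_enc  : encoder weights, shape (C, R)   (assumed per-pixel linear map)
    w_chan : channel-head weights, shape (R, C)
    w_spat : spatial-head weights, shape (R, 1)
    """
    # Encoder: project every pixel's C features to R latent features, then ReLU.
    latent = np.maximum(x @ w_enc, 0.0)                 # (H, W, R)

    # Output 1 (channel gate): pool the latent over space, decode to C weights.
    chan = sigmoid(latent.mean(axis=(0, 1)) @ w_chan)   # (C,)

    # Output 2 (spatial gate): decode the same latent to one weight per pixel.
    spat = sigmoid(latent @ w_spat)                     # (H, W, 1)

    # Apply both gates to the input by broadcasting.
    return x * chan * spat, chan, spat


# Usage with random weights (illustrative sizes only).
rng = np.random.default_rng(0)
H, W, C, R = 8, 8, 16, 4
x = rng.standard_normal((H, W, C))
w_enc = 0.1 * rng.standard_normal((C, R))
w_chan = 0.1 * rng.standard_normal((R, C))
w_spat = 0.1 * rng.standard_normal((R, 1))

out, chan, spat = two_output_attention(x, w_enc, w_chan, w_spat)
print(out.shape, chan.shape, spat.shape)
```

In this sketch both gates depend on the same `latent` array, which is the contrast the abstract draws with CBAM, where the channel and spatial modules are computed as separate parts.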
Pages: 2087-2091
Page count: 5