A Symmetric Efficient Spatial and Channel Attention (ESCA) Module Based on Convolutional Neural Networks

Cited by: 0
Authors
Liu, Huaiyu [1]
Zhang, Yueyuan [1]
Chen, Yiyang [1]
Affiliations
[1] Soochow Univ, Sch Mech & Elect Engn, Suzhou 215137, Peoples R China
Source
SYMMETRY-BASEL | 2024 / Vol. 16 / Issue 08
Keywords
deep learning; attention mechanisms; symmetric; computer vision; image classification; object detection
DOI
10.3390/sym16080952
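Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09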
Abstract
In recent years, attention mechanisms have shown great potential in various computer vision tasks. However, most existing methods pursue better performance by developing more complex attention modules, which inevitably increases model complexity. To address the tradeoff between performance and complexity, this paper proposes efficient spatial and channel attention (ESCA), a symmetric, comprehensive, and efficient attention module. By analyzing the squeeze-and-excitation (SE), convolutional block attention module (CBAM), coordinate attention (CA), and efficient channel attention (ECA) modules, we abandon the dimension-reduction operation of the SE module, verify the negative impact of global max pooling (GMP) on the model, and apply a local cross-channel interaction strategy without dimension reduction to learn attention. The symmetric ESCA module is designed to attend to both the channel features of the image and the spatial location of the target in the image, while preserving the effectiveness of channel attention. Its effectiveness is demonstrated on the ResNet-50 classification benchmark: with 26.26 M parameters and 8.545 G FLOPs, it adds only 0.14% in FLOPs while improving Top-1 accuracy by more than 6.33% and Top-5 accuracy by more than 3.25%. We perform image classification and object detection experiments with ResNet, MobileNet, YOLO, and other architectures on popular datasets such as Mini ImageNet, CIFAR-10, and VOC 2007. The experiments show that ESCA substantially improves model accuracy at a very small cost and performs well among similar models.
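The "local cross-channel interaction strategy without dimension reduction" mentioned in the abstract can be illustrated with a minimal sketch in the style of ECA-type channel attention. This is not the authors' ESCA implementation: the spatial branch is omitted, the function names and kernel values are hypothetical, and pure Python lists stand in for tensors. It only shows the general idea of pooling each channel to one value, mixing neighbouring channels with a small 1D kernel, and rescaling the channels with sigmoid weights.

```python
import math


def global_avg_pool(feature_map):
    """Reduce a C x H x W nested list to one mean value per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def conv1d_same(signal, kernel):
    """1D cross-correlation along the channel axis, zero padded.

    Assumes an odd kernel length so the output has the same length
    as the input. No channel reduction is performed.
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feature_map, kernel):
    """Return the rescaled feature map and the per-channel weights."""
    pooled = global_avg_pool(feature_map)                    # C values
    weights = [sigmoid(v) for v in conv1d_same(pooled, kernel)]
    scaled = [
        [[w * v for v in row] for row in channel]
        for channel, w in zip(feature_map, weights)
    ]
    return scaled, weights


# Hypothetical 2-channel, 2x2 input and a hand-picked kernel of size 3.
example = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
scaled, weights = channel_attention(example, [0.0, 1.0, 0.0])
```

In a trained network the kernel would be learned rather than hand-picked; the identity-like kernel `[0.0, 1.0, 0.0]` is used here only so the weights are easy to check by hand.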
Pages: 15
Related papers
39 in total
  • [1] Generating Image Captions Using Bahdanau Attention Mechanism and Transfer Learning
    Ayoub, Shahnawaz
    Gulzar, Yonis
    Reegu, Faheem Ahmad
    Turaev, Sherzod
    [J]. SYMMETRY-BASEL, 2022, 14 (12):
  • [2] GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond
    Cao, Yue
    Xu, Jiarui
    Lin, Stephen
    Wei, Fangyun
    Hu, Han
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 1971 - 1980
  • [3] SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning
    Chen, Long
    Zhang, Hanwang
    Xiao, Jun
    Nie, Liqiang
    Shao, Jian
    Liu, Wei
    Chua, Tat-Seng
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 6298 - 6306
  • [4] Chen YP, 2018, ADV NEUR IN, V31
  • [5] Xception: Deep Learning with Depthwise Separable Convolutions
    Chollet, Francois
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 1800 - 1807
  • [6] Dual Attention Network for Scene Segmentation
    Fu, Jun
    Liu, Jing
    Tian, Haijie
    Li, Yong
    Bao, Yongjun
    Fang, Zhiwei
    Lu, Hanqing
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 3141 - 3149
  • [7] Global Second-order Pooling Convolutional Networks
    Gao, Zilin
    Xie, Jiangtao
    Wang, Qilong
    Li, Peihua
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 3019 - 3028
  • [8] Goyal A., 2022, P 36 INT C NEUR INF, V35, P6789, DOI 10.48550/arXiv.2110.07641
  • [9] Cloud Detection for Satellite Imagery Using Attention-Based U-Net Convolutional Neural Network
    Guo, Yanan
    Cao, Xiaoqun
    Liu, Bainian
    Gao, Mei
    [J]. SYMMETRY-BASEL, 2020, 12 (06):
  • [10] Deep Residual Learning for Image Recognition
    He, Kaiming
    Zhang, Xiangyu
    Ren, Shaoqing
    Sun, Jian
    [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 770 - 778