A Symmetric Efficient Spatial and Channel Attention (ESCA) Module Based on Convolutional Neural Networks

Cited by: 0

Authors:

Liu, Huaiyu [1]

Zhang, Yueyuan [1]

Chen, Yiyang [1]

Affiliations:

[1] Soochow Univ, Sch Mech & Elect Engn, Suzhou 215137, Peoples R China

Source:

SYMMETRY-BASEL | 2024 / Vol. 16 / Issue 08

Keywords:

deep learning; attention mechanisms; symmetric; computer vision; image classification; object detection;

DOI:

10.3390/sym16080952

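Chinese Library Classification (CLC) codes:

O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];

Subject classification codes: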

07 ; 0710 ; 09 ;

Abstract:

In recent years, attention mechanisms have shown great potential in various computer vision tasks. However, most existing methods focus on developing more complex attention modules for better performance, which inevitably increases model complexity. To overcome the tradeoff between performance and complexity, this paper proposes efficient spatial and channel attention (ESCA), a symmetric, comprehensive, and efficient attention module. By analyzing the squeeze-and-excitation (SE), convolutional block attention module (CBAM), coordinate attention (CA), and efficient channel attention (ECA) modules, we abandon the dimension-reduction operation of the SE module, verify the negative impact of global max pooling (GMP) on the model, and apply a local cross-channel interaction strategy without dimension reduction to learn attention. Because we consider not only the channel features of the image but also the spatial location of the target, while preserving the effectiveness of channel attention, we designed the symmetric ESCA module. The ESCA module is effective, as demonstrated by its application to the ResNet-50 classification benchmark: with 26.26 M parameters and 8.545 G FLOPs, it introduces a mere 0.14% increase in FLOPs while improving Top-1 accuracy by more than 6.33% and Top-5 accuracy by more than 3.25%. We perform image classification and object detection with ResNet, MobileNet, YOLO, and other architectures on popular datasets such as Mini ImageNet, CIFAR-10, and VOC 2007. Experiments show that ESCA substantially improves model accuracy at a very small cost and performs well among similar models.
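
The "local cross-channel interaction strategy without dimension reduction" mentioned in the abstract can be illustrated with a minimal, dependency-free sketch in the style of ECA: each channel is summarized by global average pooling, a small 1D kernel mixes each channel descriptor with its nearest neighbours along the channel axis, and a sigmoid turns the result into a per-channel gate. This is not the authors' ESCA implementation and it omits the spatial branch entirely; the function names and the kernel weights are hypothetical, chosen only for illustration (in a trained network the kernel would be learned).

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def global_avg_pool(feature_map):
    """Reduce a C x H x W nested list to one mean value per channel."""
    pooled = []
    for channel in feature_map:
        total = sum(sum(row) for row in channel)
        count = sum(len(row) for row in channel)
        pooled.append(total / count)
    return pooled


def conv1d_same(values, kernel):
    """1D cross-correlation along the channel axis, zero-padded to keep length."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(values) + [0.0] * pad
    return [
        sum(kernel[j] * padded[i + j] for j in range(k))
        for i in range(len(values))
    ]


def channel_attention(feature_map, kernel):
    """Gate each channel using only its k nearest channel neighbours.

    No fully connected bottleneck is used, so the channel dimension is
    never reduced (the assumption this sketch is meant to illustrate).
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    pooled = global_avg_pool(feature_map)
    gates = [sigmoid(v) for v in conv1d_same(pooled, kernel)]
    scaled = [
        [[gate * x for x in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Hypothetical 3-channel, 2 x 2 feature map and a hand-picked kernel of size 3.
example_map = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[2.0, 2.0], [2.0, 2.0]],
    [[3.0, 3.0], [3.0, 3.0]],
]
example_scaled, example_gates = channel_attention(example_map, [0.25, 0.5, 0.25])
print(example_gates)
```

The only parameters in this gate are the k kernel weights, which is one plausible reading of why such a module adds very little cost compared with a bottleneck of fully connected layers.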

Pages: 15

References: 39 in total

