A Symmetric Efficient Spatial and Channel Attention (ESCA) Module Based on Convolutional Neural Networks

Cited by: 1
Authors
Liu, Huaiyu [1 ]
Zhang, Yueyuan [1 ]
Chen, Yiyang [1 ]
Affiliations
[1] Soochow Univ, Sch Mech & Elect Engn, Suzhou 215137, Peoples R China
Source
SYMMETRY-BASEL | 2024, Vol. 16, No. 8
Keywords
deep learning; attention mechanisms; symmetric; computer vision; image classification; object detection;
DOI
10.3390/sym16080952
CLC Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline Codes
07; 0710; 09;
Abstract
In recent years, attention mechanisms have shown great potential in a variety of computer vision tasks. However, most existing methods focus on developing ever more complex attention modules for better performance, which inevitably increases model complexity. To overcome this trade-off between performance and complexity, this paper proposes efficient spatial and channel attention (ESCA), a symmetric, comprehensive, and efficient attention module. By analyzing the squeeze-and-excitation (SE), convolutional block attention module (CBAM), coordinate attention (CA), and efficient channel attention (ECA) modules, we abandon the dimension-reduction operation of the SE module, verify the negative impact of global max pooling (GMP) on the model, and apply a local cross-channel interaction strategy without dimension reduction to learn attention. Because we care not only about the channel features of an image but also about the spatial location of the target on the image, while retaining the effectiveness of channel attention, we designed the symmetric ESCA module. Its effectiveness is demonstrated on the ResNet-50 classification benchmark: with 26.26 M parameters and 8.545 G FLOPs, ESCA introduces a mere 0.14% increment in FLOPs while achieving over 6.33% improvement in Top-1 accuracy and over 3.25% gain in Top-5 accuracy. We perform image classification and object detection tasks on ResNet, MobileNet, YOLO, and other architectures using popular datasets such as Mini-ImageNet, CIFAR-10, and VOC 2007. Experiments show that ESCA achieves substantial improvements in model accuracy at very small cost and performs well among similar modules.
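The channel branch the abstract describes (an ECA-style local cross-channel interaction with no dimension reduction, using global average pooling rather than GMP) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function name is hypothetical, and the uniform 1D-convolution weights stand in for the learned kernel.

```python
import numpy as np

def eca_style_channel_attention(x, k=3):
    """Illustrative ECA-style channel attention on a (C, H, W) feature map.

    Steps: global average pooling (squeeze), a 1D convolution of size k
    across channels (local cross-channel interaction, no FC bottleneck),
    then a sigmoid gate that rescales each channel.
    """
    c, h, w = x.shape
    # Squeeze: global average pooling over spatial dimensions -> (C,)
    y = x.mean(axis=(1, 2))
    # Local cross-channel interaction: size-k 1D conv with 'same' padding.
    # Uniform weights are a placeholder for the learned kernel.
    kernel = np.full(k, 1.0 / k)
    y = np.convolve(y, kernel, mode="same")
    # Excitation: sigmoid gate in (0, 1), then rescale the channels
    gate = 1.0 / (1.0 + np.exp(-y))
    return x * gate[:, None, None]

# Example: 4 channels, 8x8 spatial map
feat = np.random.default_rng(0).standard_normal((4, 8, 8))
out = eca_style_channel_attention(feat)
```

Because the gate multiplies rather than projects, the branch keeps a one-to-one channel correspondence, which is the point of avoiding SE's dimension reduction.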
Pages: 15
References
39 in total; items [31]-[39] shown
[31] Szegedy, C. Proc. CVPR IEEE, 2015: 1. DOI 10.1109/CVPR.2015.7298594.
[32] Wang, Huasheng; Liu, Jiang; Tan, Hongchen; Lou, Jianxun; Liu, Xiaochang; Zhou, Wei; Liu, Hantao. Blind Image Quality Assessment via Adaptive Graph Attention. IEEE Transactions on Circuits and Systems for Video Technology, 2024, 34(10): 10299-10309.
[33] Wang, Q. L. Proc. CVPR IEEE, 2020: 11531. DOI 10.1109/CVPR42600.2020.01155.
[34] Wang, Xiaolong; Girshick, Ross; Gupta, Abhinav; He, Kaiming. Non-local Neural Networks. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018: 7794-7803.
[35] Woo, Sanghyun; Park, Jongchan; Lee, Joon-Young; Kweon, In So. CBAM: Convolutional Block Attention Module. Computer Vision - ECCV 2018, Pt VII, 2018, 11211: 3-19.
[36] Xie, Saining; Girshick, Ross; Dollar, Piotr; Tu, Zhuowen; He, Kaiming. Aggregated Residual Transformations for Deep Neural Networks. 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), 2017: 5987-5995.
[37] Yang, Wenqiang; Yuan, Ying; Zhang, Donghua; Zheng, Liyuan; Nie, Fuquan. An Effective Image Classification Method for Plant Diseases with Improved Channel Attention Mechanism aECAnet Based on Deep Learning. Symmetry-Basel, 2024, 16(4).
[38] Zagoruyko, S. arXiv, 2017. DOI arXiv:1605.07146.
[39] Zhang, Qing-Long; Yang, Yu-Bin. SA-Net: Shuffle Attention for Deep Convolutional Neural Networks. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2021), 2021: 2235-2239.