SPAM: Spatially Partitioned Attention Module in Deep Convolutional Neural Networks for Image Classification

Cited by: 0
Authors
Wang F. [1 ]
Qiao R. [1 ]
Affiliations
[1] School of Information and Communication Engineering, Xi'an Jiaotong University, Xi'an
Keywords
attention mechanisms; convolutional neural networks; feature extraction; image classification
DOI
10.7652/xjtuxb202309019
Abstract
Existing attention mechanisms often rely on fusion or compression to obtain the required information, which discards a large amount of information along the spatial or channel dimension. To solve this problem, this paper proposes the Spatially Partitioned Attention Module (SPAM), an effective and lightweight attention module that obtains attention without channel fusion or compression. For an input intermediate feature map, SPAM first adaptively extracts features with average pooling and maximum pooling, replacing point features with local spatial block features to reduce computation, and uses an instance normalization (IN) layer and depthwise convolution to capture global spatial attention. Meanwhile, channel-dimension information is reconstructed directly by group convolution. Finally, an interpolation operation produces the overall attention map, which weights the input feature map. Notably, SPAM can be easily embedded in various mainstream CNN architectures and significantly improves network performance while adding only a small number of parameters and calculations. To demonstrate its effectiveness, extensive experiments were conducted on the ImageNet-1K, CIFAR-100, and Food-101 datasets, and the networks' regions of interest were visualized with Grad-CAM. On ImageNet-1K, CIFAR-100, and Food-101, SPAM improved the accuracy of the baseline network by up to about 1.08%, 2.46%, and 1.09%, respectively. The results show that networks embedded with SPAM components perform markedly better; that SPAM consistently outperforms other commonly used lightweight attention mechanisms; and that SPAM indeed induces networks to attend more closely to target object regions and improves their representational ability. © 2023 Xi'an Jiaotong University. All rights reserved.
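The pipeline the abstract describes (block pooling, an IN layer plus depthwise convolution for spatial attention, group convolution for channel reconstruction, then interpolation and weighting) can be sketched as follows. This is a minimal PyTorch sketch reconstructed from the abstract alone; the block size, kernel size, group count, and the way the pooled features are combined are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SPAM(nn.Module):
    """Sketch of a Spatially Partitioned Attention Module.

    Hyperparameters (block=7, groups=4, kernel=3) and the sum of the
    pooled branches are assumptions for illustration only.
    """

    def __init__(self, channels: int, block: int = 7,
                 groups: int = 4, kernel: int = 3):
        super().__init__()
        self.block = block
        # Instance normalization over the pooled block features
        self.inorm = nn.InstanceNorm2d(channels, affine=True)
        # Depthwise convolution: spatial attention without channel fusion
        self.dw = nn.Conv2d(channels, channels, kernel,
                            padding=kernel // 2, groups=channels, bias=False)
        # Group convolution: reconstruct channel information cheaply
        self.gc = nn.Conv2d(channels, channels, 1, groups=groups, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[-2:]
        # Local block features replace point features to cut computation
        avg = F.adaptive_avg_pool2d(x, self.block)
        mx = F.adaptive_max_pool2d(x, self.block)
        attn = self.gc(self.dw(self.inorm(avg + mx)))
        # Interpolate block attention back to full resolution and weight input
        attn = torch.sigmoid(F.interpolate(attn, size=(h, w), mode="nearest"))
        return x * attn
```

Because the module preserves the input shape, it can be dropped after any convolutional stage of a mainstream CNN, e.g. `y = SPAM(64)(feature_map)` for a 64-channel feature map.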
Pages: 185-192
Page count: 7