Enhanced mechanisms of pooling and channel attention for deep learning feature maps

Cited by: 2
Authors
Li, Hengyi [1]
Yue, Xuebin [1]
Meng, Lin [2]
Affiliations
[1] Ritsumeikan Univ, Grad Sch Sci & Engn, Kusatsu, Shiga, Japan
[2] Ritsumeikan Univ, Coll Sci & Engn, Kusatsu, Shiga, Japan
Keywords
DNNs; Max pooling; Average pooling; FMAPooling; Self-attention; FMAttn
DOI
10.7717/peerj-cs.1161
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Pooling is vital for deep neural networks (DNNs). It generalizes the representation of feature maps and progressively reduces their spatial size, which lowers the computational cost of the network. It is also the basis for attention mechanisms in computer vision. However, pooling is a down-sampling operation: it makes the feature-map representation approximately invariant to small translations by replacing adjacent pixels with a summary statistic, and it therefore inevitably loses some information. In this article, we propose a fused max-average pooling (FMAPooling) operation and an improved channel attention mechanism (FMAttn), both of which combine max pooling and average pooling to enhance the feature representation of DNNs. In essence, the methods enhance the multiple-level features extracted by max pooling and average pooling, respectively. The effectiveness of the proposals is verified with VGG, ResNet, and MobileNetV2 architectures on CIFAR10/100 and ImageNet100. In the experiments, FMAPooling improves accuracy by up to 1.63% over the baseline model, and FMAttn improves accuracy by up to 2.21% over the previous channel attention mechanism. The proposals are also extensible: they can be embedded into various DNN models easily or can replace certain structures of DNNs. The computational overhead they introduce is negligible.
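The abstract does not say how the two pooling results are fused, so the following is only a minimal NumPy sketch of the general idea, not the authors' FMAPooling. It computes non-overlapping 2x2 max pooling and average pooling over a (channels, height, width) feature map and then combines them. The helper names `pool2x2` and `fused_pool2x2`, the weight `alpha`, and the convex-combination fusion rule are all assumptions made for illustration.

```python
import numpy as np


def pool2x2(fmap, reduce_fn):
    """Non-overlapping 2x2 pooling over a (C, H, W) array; H and W must be even."""
    c, h, w = fmap.shape
    # Split each spatial axis into (block index, offset inside the block).
    windows = fmap.reshape(c, h // 2, 2, w // 2, 2)
    # Reduce over the two within-block axes.
    return reduce_fn(windows, axis=(2, 4))


def fused_pool2x2(fmap, alpha=0.5):
    """Hypothetical fusion: weighted mix of max-pooled and average-pooled maps.

    This rule is an assumption for illustration; the source does not specify it.
    """
    max_out = pool2x2(fmap, np.max)
    avg_out = pool2x2(fmap, np.mean)
    return alpha * max_out + (1.0 - alpha) * avg_out


# One channel, 4x4 feature map holding the values 0..15 row by row.
fmap = np.arange(16, dtype=float).reshape(1, 4, 4)
max_out = pool2x2(fmap, np.max)    # keeps the strongest response per window
avg_out = pool2x2(fmap, np.mean)   # keeps the mean response per window
fused_out = fused_pool2x2(fmap)    # retains part of both statistics
print(max_out[0], avg_out[0], fused_out[0], sep="\n")
```

Each pooling alone discards information that the other retains: max pooling drops everything but the peak in each window, while average pooling smooths the peak away. Mixing the two is one simple way to keep part of both, which is the motivation the abstract gives for using the two pooling functions together.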
Pages: 18