Delving deep into spatial pooling for squeeze-and-excitation networks

Cited by: 80
Authors
Jin, Xin [4 ]
Xie, Yanping [1 ,2 ,5 ]
Wei, Xiu-Shen [1 ,2 ,3 ]
Zhao, Bo-Rui [4 ]
Chen, Zhao-Min [4 ]
Tan, Xiaoyang [5 ]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Key Lab Intelligent Percept & Syst High Dimens In, PCA Lab, Minist Educ, Nanjing, Peoples R China
[2] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Jiangsu Key Lab Image & Video Understanding Socia, Nanjing, Peoples R China
[3] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing, Peoples R China
[4] Megvii Technol, Megvii Res Nanjing, Nanjing, Peoples R China
[5] Nanjing Univ Aeronaut & Astronaut, Dept Comp Sci & Technol, Nanjing, Peoples R China
Keywords
Convolutional neural networks; Squeeze-and-excitation; Spatial pooling; Base model; Image; Classification; Attention
DOI
10.1016/j.patcog.2021.108159
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Squeeze-and-Excitation (SE) blocks have demonstrated significant accuracy gains for state-of-the-art deep architectures by re-weighting channel-wise feature responses. The SE block is an architectural unit that integrates two operations: a squeeze operation that employs global average pooling to aggregate spatial convolutional features into a per-channel feature, and an excitation operation that learns instance-specific channel weights from the squeezed feature to re-weight each channel. In this paper, we revisit the squeeze operation in SE blocks and shed light on why and how to embed rich (both global and local) information into the excitation module at minimal extra cost. In particular, we introduce a simple but effective two-stage spatial pooling process: rich descriptor extraction and information fusion. The rich descriptor extraction step aims to obtain a set of diverse (i.e., global and especially local) deep descriptors that carry more informative cues than global average pooling alone. Absorbing the additional information delivered by these descriptors via a fusion step then helps the excitation operation return more accurate re-weighting scores in a data-driven manner. We validate the effectiveness of our method by extensive experiments on ImageNet for image classification and on MS-COCO for object detection and instance segmentation. In these experiments, our method achieves consistent improvements over SENets on all tasks, in some cases by a large margin. (c) 2021 Published by Elsevier Ltd.
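To make the abstract's description concrete, the PyTorch sketch below first shows a standard SE block as described above: a global-average-pooling squeeze followed by a bottleneck excitation that outputs sigmoid channel weights. The second class is only a schematic illustration of the two-stage spatial pooling idea; the RichPoolSEBlock name, the local_grid pooling, and the simple mean fusion are assumptions chosen for illustration, not the paper's actual descriptor-extraction or fusion modules.

import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Standard SE block: squeeze by global average pooling,
    excitation by a bottleneck MLP that yields per-channel weights."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        # Squeeze: aggregate spatial features into one descriptor per channel.
        z = x.mean(dim=(2, 3))                      # (B, C)
        # Excitation: learn instance-specific channel weights.
        w = self.excite(z).view(b, c, 1, 1)         # (B, C, 1, 1)
        return x * w

class RichPoolSEBlock(SEBlock):
    """Illustrative two-stage squeeze (assumed, not the paper's module):
    stage 1 extracts a global descriptor plus local grid descriptors,
    stage 2 fuses them (here a plain mean) before excitation."""
    def __init__(self, channels: int, reduction: int = 16, local_grid: int = 2):
        super().__init__(channels, reduction)
        self.local_pool = nn.AdaptiveAvgPool2d(local_grid)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        global_desc = x.mean(dim=(2, 3), keepdim=True)              # (B, C, 1, 1)
        local_desc = self.local_pool(x)                             # (B, C, g, g)
        # Fuse global and local descriptors into a single channel feature.
        descriptors = torch.cat(
            [global_desc.flatten(2), local_desc.flatten(2)], dim=2)  # (B, C, 1 + g*g)
        z = descriptors.mean(dim=2)                                  # (B, C)
        w = self.excite(z).view(b, c, 1, 1)
        return x * w

if __name__ == "__main__":
    # Re-weight a toy (2, 64, 56, 56) feature map; output shape is unchanged.
    x = torch.randn(2, 64, 56, 56)
    print(RichPoolSEBlock(64)(x).shape)  # torch.Size([2, 64, 56, 56])

As in SENets, such a block is inserted after the convolutional stack of each residual unit, so the re-weighted features are produced before the identity addition and the overall network topology is unchanged.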
Pages: 12