Attention Augmented Convolutional Networks

Cited by: 824
Authors
Bello, Irwan [1]
Zoph, Barret [1]
Vaswani, Ashish [1]
Shlens, Jonathon [1]
Le, Quoc V. [1]
Affiliations
[1] Google Brain, Mountain View, CA 94043 USA
Source
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) | 2019
Keywords
ARCHITECTURES;
DOI
10.1109/ICCV.2019.00338
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Convolutional networks have been the paradigm of choice in many computer vision applications. The convolution operation, however, has a significant weakness in that it only operates on a local neighborhood, thus missing global information. Self-attention, on the other hand, has emerged as a recent advance to capture long-range interactions, but has mostly been applied to sequence modeling and generative modeling tasks. In this paper, we consider the use of self-attention for discriminative visual tasks as an alternative to convolutions. We introduce a novel two-dimensional relative self-attention mechanism that proves competitive in replacing convolutions as a stand-alone computational primitive for image classification. We find in control experiments that the best results are obtained when combining both convolutions and self-attention. We therefore propose to augment convolutional operators with this self-attention mechanism by concatenating convolutional feature maps with a set of feature maps produced via self-attention. Extensive experiments show that Attention Augmentation leads to consistent improvements in image classification on ImageNet and object detection on COCO across many different models and scales, including ResNets and a state-of-the-art mobile-constrained network, while keeping the number of parameters similar. In particular, our method achieves a 1.3% top-1 accuracy improvement on ImageNet classification over a ResNet50 baseline and outperforms other attention mechanisms for images such as Squeeze-and-Excitation [17]. It also achieves an improvement of 1.4 mAP in COCO Object Detection on top of a RetinaNet baseline.
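The concatenation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single attention head, omits the two-dimensional relative position encodings, stands in a 1x1 (pointwise) convolution for the convolutional branch, and uses random weights. All function names (`matmul`, `softmax`, `self_attention`, `augmented_conv`, `random_matrix`) are hypothetical and chosen for illustration only.

```python
import math
import random


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of logits."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention over flattened positions x of shape (N, C_in).

    Assumption: no relative position terms, unlike the mechanism in the paper.
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(w_q[0])
    k_t = [list(col) for col in zip(*k)]
    logits = matmul(q, k_t)  # (N, N): each position scores all positions
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in logits]
    return matmul(weights, v)  # (N, d_v): globally mixed features


def augmented_conv(x, w_conv, w_q, w_k, w_v):
    """Concatenate a local branch with an attention branch along channels."""
    conv_out = matmul(x, w_conv)  # 1x1 convolution == per-position linear map
    attn_out = self_attention(x, w_q, w_k, w_v)
    return [c + a for c, a in zip(conv_out, attn_out)]  # list concat per position


def random_matrix(rng, rows, cols):
    """Hypothetical helper: uniform random weights in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
height, width, c_in = 4, 4, 8
features = random_matrix(rng, height * width, c_in)  # flattened H*W positions
out = augmented_conv(
    features,
    random_matrix(rng, c_in, 12),  # 12 convolutional output channels
    random_matrix(rng, c_in, 4),   # query projection, d_k = 4
    random_matrix(rng, c_in, 4),   # key projection, d_k = 4
    random_matrix(rng, c_in, 4),   # value projection, d_v = 4
)
print((len(out), len(out[0])))  # (16, 16): 16 positions, 12 + 4 channels
```

In this sketch the first 12 output channels at each position depend only on that position's input, while the last 4 depend on every position in the feature map, which is the local versus global split the abstract motivates.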
Pages: 3285-3294
Number of pages: 10
Related papers
55 in total
  • [11] [Anonymous], 2016, PROC CVPR IEEE, DOI 10.1109/CVPR.2016.90
  • [12] [Anonymous], 2017, CORR
  • [13] [Anonymous], 2016, EUR C COMP VIS
  • [14] [Anonymous], 2016, P BRIT MACHINE VISIO
  • [15] [Anonymous], 2012, NIPS 1106-1114
  • [16] [Anonymous], 2018, INT C MACH LEARN
  • [17] [Anonymous], 2018, CORR
  • [18] [Anonymous], 2020, P IEEE C COMP VIS PA
  • [19] [Anonymous], 2015, 3 INT C LEARN REPR I
  • [20] Bello Irwan, 2018, CoRR abs/1810.02019