Non-local Neural Networks

Cited by: 5698
Authors
Wang, Xiaolong [1, 2]
Girshick, Ross [2 ]
Gupta, Abhinav [1 ]
He, Kaiming [2 ]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Facebook AI Res, Menlo Pk, CA 94025 USA
Source
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2018
DOI
10.1109/CVPR.2018.00813
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Both convolutional and recurrent operations are building blocks that process one local neighborhood at a time. In this paper we present non-local operations as a generic family of building blocks for capturing long-range dependencies. Inspired by the classical non-local means method [4] in computer vision, our non-local operation computes the response at a position as a weighted sum of the features at all positions. This building block can be plugged into many computer vision architectures. On the task of video classification, even without any bells and whistles, our non-local models compete with or outperform current competition winners on both the Kinetics and Charades datasets. In static image recognition, our non-local models improve object detection/segmentation and pose estimation on the COCO suite of tasks. Code will be made available.
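The central idea in the abstract, a response at each position computed as a weighted sum of the features at all positions, can be illustrated with a small sketch. The abstract does not say how the weights are computed, so the softmax-normalized dot-product similarity used here is an assumption made for illustration, and the function name `non_local_response` is hypothetical. This is not the authors' released code.

```python
import math


def non_local_response(features):
    """For each position i, return a weighted sum of the features at all positions j.

    `features` is a list of equal-length feature vectors, one per position.
    """
    n = len(features)
    out = []
    for i in range(n):
        # Assumed similarity: dot product between position i and every position j.
        scores = [
            sum(a * b for a, b in zip(features[i], features[j]))
            for j in range(n)
        ]
        # Softmax normalization so the weights for position i sum to 1
        # (the maximum is subtracted for numerical stability).
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Response at position i: weighted sum over all positions, not just neighbors.
        dim = len(features[i])
        out.append(
            [sum(weights[j] * features[j][d] for j in range(n)) for d in range(dim)]
        )
    return out


features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
response = non_local_response(features)
```

Because every position attends to every other position, the cost grows quadratically with the number of positions, which is the trade-off for capturing long-range dependencies in a single step.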
Pages: 7794-7803
Page count: 10
Related papers
56 in total
  • [1] [Anonymous], INT C AC SPEECH SIGN
  • [2] [Anonymous], 2012, Technical Report
  • [3] [Anonymous], 2016, Lecture Notes in Computer Science, DOI 10.1007/978-3-319-46493-0_38
  • [4] [Anonymous], 2001, PROC 18 INT C MACH L
  • [5] [Anonymous], 2017, NIPS
  • [6] [Anonymous], 2017, ARXIV170803805
  • [7] [Anonymous], 2016, Advances in Neural Information Processing Systems
  • [8] Barnes C., 2009, P SIGGRAPH ACM T GRA
  • [9] Battaglia P., 2016, NeurIPS, P4502
  • [10] A non-local algorithm for image denoising
    Buades, A
    Coll, B
    Morel, JM
    [J]. 2005 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOL 2, PROCEEDINGS, 2005: 60-65