Pixel-wise Attentional Gating for Scene Parsing

Cited by: 29
Authors
Kong, Shu [1]
Fowlkes, Charless [1]
Affiliations
[1] Univ Calif Irvine, Dept Comp Sci, Irvine, CA 92697 USA
Source
2019 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV) | 2019
DOI
10.1109/WACV.2019.00114
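Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract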
To achieve dynamic inference in pixel labeling tasks, we propose Pixel-wise Attentional Gating (PAG), which learns to selectively process a subset of spatial locations at each layer of a deep convolutional network. PAG is a generic, architecture-independent, problem-agnostic mechanism that can be readily "plugged in" to an existing model with fine-tuning. We utilize PAG in two ways: 1) learning spatially varying pooling fields that improve model performance without the extra computation cost associated with multi-scale pooling, and 2) learning a dynamic computation policy for each pixel to decrease total computation (FLOPs) while maintaining accuracy. We extensively evaluate PAG on a variety of per-pixel labeling tasks, including semantic segmentation, boundary detection, monocular depth and surface normal estimation. We demonstrate that PAG allows competitive or state-of-the-art performance on these tasks. Our experiments show that PAG learns dynamic spatial allocation of computation over the input image which provides better performance trade-offs compared to related approaches (e.g., truncating deep models or dynamically skipping whole layers). Generally, we observe PAG can reduce computation by 10% without noticeable loss in accuracy and performance degrades gracefully when imposing stronger computational constraints.
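The per-pixel gating idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `gated_layer`, the sigmoid gate with a hard 0.5 threshold, and the residual-style update are all assumptions chosen for illustration, and the record itself does not say how the binary gates are produced or trained.

```python
import math
from typing import Callable, List, Tuple

Grid = List[List[float]]  # a single-channel feature map, row-major


def gated_layer(
    features: Grid,
    gate_logits: Grid,
    transform: Callable[[float], float],
    threshold: float = 0.5,
) -> Tuple[Grid, float]:
    """Apply `transform` only at pixels whose gate is open.

    Hypothetical sketch: pixels with a closed gate are copied through
    unchanged, so no computation is spent on them. Returns the output
    map and the fraction of pixels that were actually processed.
    """
    output: Grid = []
    processed = 0
    total = 0
    for feat_row, logit_row in zip(features, gate_logits):
        out_row = []
        for x, logit in zip(feat_row, logit_row):
            gate = 1.0 / (1.0 + math.exp(-logit))  # sigmoid gate score
            total += 1
            if gate > threshold:
                # Gate open: residual-style update (an assumption).
                out_row.append(x + transform(x))
                processed += 1
            else:
                # Gate closed: identity, the pixel skips this layer.
                out_row.append(x)
        output.append(out_row)
    fraction = processed / total if total else 0.0
    return output, fraction


features = [[1.0, 2.0], [3.0, 4.0]]
gate_logits = [[2.0, -2.0], [-2.0, 2.0]]  # open on the diagonal only
out, frac = gated_layer(features, gate_logits, lambda x: 0.5 * x)
print(out, frac)
```

In this toy setting the processed fraction stands in for the computational budget: lowering it corresponds to the stronger computational constraints mentioned in the abstract, at the cost of updating fewer pixels per layer.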
Pages: 1024-1033
Page count: 10