ELASTIC: Improving CNNs with Dynamic Scaling Policies

Cited by: 44
Authors
Wang, Huiyu [1]
Kembhavi, Aniruddha [2]
Farhadi, Ali [2, 3, 4]
Yuille, Alan [1]
Rastegari, Mohammad [2, 4]
Affiliations
[1] Johns Hopkins Univ, Baltimore, MD 21218 USA
[2] PRIOR Allen Inst AI, Seattle, WA USA
[3] Univ Washington, Seattle, WA 98195 USA
[4] Xnor Ai, Seattle, WA USA
Source
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019
DOI
10.1109/CVPR.2019.00236
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Scale variation has been a challenge from traditional to modern approaches in computer vision. Most solutions to scale issues have a similar theme: a set of intuitive and manually designed policies that are generic and fixed (e.g. SIFT or feature pyramid). We argue that the scaling policy should be learned from data. In this paper, we introduce Elastic, a simple, efficient and yet very effective approach to learn a dynamic scale policy from data. We formulate the scaling policy as a non-linear function inside the network's structure that (a) is learned from data, (b) is instance specific, (c) does not add extra computation, and (d) can be applied on any network architecture. We applied Elastic to several state-of-the-art network architectures and showed consistent improvement without extra (sometimes even lower) computation on ImageNet classification, MSCOCO multi-label classification, and PASCAL VOC semantic segmentation. Our results show major improvement for images with scale challenges.
Pages: 2253-2262
Page count: 10