Feature Selective Anchor-Free Module for Single-Shot Object Detection

被引：715

作者：

Zhu, Chenchen ^{[1
]}

He, Yihui ^{[1
]}

Savvides, Marios ^{[1
]}

机构：

[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA

来源：

2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019年

关键词：

D O I：

10.1109/CVPR.2019.00093

中图分类号：

TP18 [人工智能理论];

学科分类号：

081104 ; 0812 ; 0835 ; 1405 ;

摘要：

We motivate and present feature selective anchor-free (FSAF) module, a simple and effective building block for single-shot object detectors. It can be plugged into single-shot detectors with feature pyramid structure. The FSAF module addresses two limitations brought up by the conventional anchor-based detection: 1) heuristic-guided feature selection; 2) overlap-based anchor sampling. The general concept of the FSAF module is online feature selection applied to the training of multi-level anchor-free branches. Specifically, an anchor-free branch is attached to each level of the feature pyramid, allowing box encoding and decoding in the anchor-free manner at an arbitrary level. During training, we dynamically assign each instance to the most suitable feature level. At the time of inference, the FSAF module can work independently or jointly with anchor-based branches. We instantiate this concept with simple implementations of anchor-free branches and online feature selection strategy. Experimental results on the COCO detection track show that our FSAF module performs better than anchor-based counterparts while being faster. When working jointly with anchor-based branches, the FSAF module robustly improves the baseline RetinaNet by a large margin under various settings, while introducing nearly free inference overhead. And the resulting best model can achieve a state-of-the-art 44.6% mAP, outperforming all existing single-shot detectors on COCO.

引用

页码：840 / 849

页数：10

共 41 条

[1] [Anonymous], 2018, ARXIV180406559
[2] [Anonymous], 2015, PROC CVPR IEEE, DOI DOI 10.1109/CVPR.2015.7298642
[3] [Anonymous], 2017, ARXIV171202408
[4] [Anonymous], 2018, ARXIV180409003
[5] [Anonymous], 2017, P IEEE C COMP VIS PA
[6] Faster Than Real-time Facial Alignment: A 3D Spatial Transformer Network Approach in Unconstrained Poses
Bhagavatula, Chandrasekhar
Zhu, Chenchen
Luu, Khoa
Savvides, Marios
[J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 4000 - 4009
[7] Soft-NMS - Improving Object Detection With One Line of Code
Bodla, Navaneeth
Singh, Bharat
Chellappa, Rama
Davis, Larry S.
[J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 5562 - 5570
[8] Fast Keyframe Selection and Switching for ICP-based Camera Pose Estimation
Chen, Chun-Wei
Hsiao, Wen-Yuan
Lin, Ting-Yu
Wang, Jonas
Shieh, Ming-Der
[J]. 2018 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2018,
[9] Deformable Convolutional Networks
Dai, Jifeng
Qi, Haozhi
Xiong, Yuwen
Li, Yi
Zhang, Guodong
Hu, Han
Wei, Yichen
[J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 764 - 773
[10] Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848

← 1 2 3 4 5 →