Bottom-up Object Detection by Grouping Extreme and Center Points

被引：754

作者：

Zhou, Xingyi ^{[1
]}

Zhuo, Jiacheng ^{[1
]}

Krahenbuhl, Philipp ^{[1
]}

机构：

[1] UT Austin, Austin, TX 78712 USA

来源：

2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019年

关键词：

D O I：

10.1109/CVPR.2019.00094

中图分类号：

TP18 [人工智能理论];

学科分类号：

081104 ; 0812 ; 0835 ; 1405 ;

摘要：

With the advent of deep learning, object detection drifted from a bottom-up to a top-down recognition problem. State of the art algorithms enumerate a near-exhaustive list of object locations and classify each into: object or not. In this paper, we show that bottom-up approaches still perform competitively. We detect four extreme points (top-most, left-most, bottom-most, right-most) and one center point of objects using a standard keypoint estimation network. We group the five keypoints into a bounding box if they are geometrically aligned. Object detection is then a purely appearance-based keypoint estimation problem, without region classification or implicit feature learning. The proposed method performs on-par with the state-of-the-art region based detection methods, with a bounding box AP of 43.7% on COCO test-dev. In addition, our estimated extreme points directly span a coarse octagonal mask, with a COCO Mask AP of 18.9%, much better than the Mask AP of vanilla bounding boxes. Extreme point guided segmentation further improves this to 34.6% Mask AP.

引用

页码：850 / 859

页数：10

共 52 条

[1]

[Anonymous], 2014, ECCV

[2]

[Anonymous], 2017, ARXIV170310295

[3] Soft-NMS - Improving Object Detection With One Line of Code [J].

Bodla, Navaneeth ;

Singh, Bharat ;

Chellappa, Rama ;

Davis, Larry S. .

2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, :5562-5570

[4] Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields [J].

Cao, Zhe ;

Simon, Tomas ;

Wei, Shih-En ;

Sheikh, Yaser .

30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :1302-1310

[5] DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs [J].

Chen, Liang-Chieh ;

Papandreou, George ;

Kokkinos, Iasonas ;

Murphy, Kevin ;

Yuille, Alan L. .

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (04) :834-848

[6] Blazingly Fast Video Object Segmentation with Pixel-Wise Metric Learning [J].

Chen, Yuhua ;

Pont-Tuset, Jordi ;

Montes, Alberto ;

Van Gool, Luc .

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :1189-1198

[7]

Dai J., 2016, ADV NEURAL INFORM PR, P379, DOI DOI 10.1109/CVPR.2017.690

[8] Deformable Convolutional Networks [J].

Dai, Jifeng ;

Qi, Haozhi ;

Xiong, Yuwen ;

Li, Yi ;

Zhang, Guodong ;

Hu, Han ;

Wei, Yichen .

2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, :764-773

[9]

Everingham M., The PASCAL Visual Object Classes Challenge 2012 (VOC2012) Results

[10]

Felzenszwalb P. F., 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence, V32, P1627, DOI [10.1109/TPAMI.2009.167, DOI 10.1109/TPAMI.2009.167]

← 1 2 3 4 5 6 →