Density Map Regression Guided Detection Network for RGB-D Crowd Counting and Localization

Cited by: 142

Authors:

Lian, Dongze ^{[1]}

Li, Jing ^{[1]}

Zheng, Jia ^{[1]}

Luo, Weixin ^{[1,2]}

Gao, Shenghua ^{[1]}

Affiliations:

[1] ShanghaiTech Univ, Shanghai, Peoples R China

[2] Yoke Intelligence, Copenhagen, Denmark

Source:

2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019

Keywords:

DOI:

10.1109/CVPR.2019.00192

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

To simultaneously estimate head counts and localize heads with bounding boxes, a regression guided detection network (RDNet) is proposed for RGB-D crowd counting. Specifically, to improve the robustness of detection-based approaches for small/tiny heads, we leverage the density map to improve head/non-head classification in the detection network, where the density map serves as the probability of a pixel being a head. A depth-adaptive kernel that considers the variance in head sizes is also introduced to generate a high-fidelity density map for more robust density map regression. Further, a depth-aware anchor is designed for better initialization of anchor sizes in the detection framework. We then use bounding boxes whose sizes are estimated from depth to train our RDNet. Because the existing RGB-D datasets are too small to evaluate data-driven approaches, we collect a large-scale RGB-D crowd counting dataset. Experiments on both our RGB-D dataset and the MICC RGB-D counting dataset show that our method achieves the best performance for RGB-D crowd counting and localization. Further, our method can be readily extended to RGB image based crowd counting and achieves comparable performance on the ShanghaiTech Part_B dataset for both counting and localization.
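The depth-adaptive kernel mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the kernel's formula, so the rule used here (Gaussian spread inversely proportional to depth, reflecting that nearer heads appear larger) is an assumption, and the function name and the `beta` and `min_sigma` parameters are hypothetical rather than taken from the paper.

```python
import math


def depth_adaptive_density_map(height, width, heads, depth,
                               beta=4.0, min_sigma=1.0):
    """Build a density map with one Gaussian per annotated head.

    heads: list of (row, col) head centers.
    depth: 2D list of depth values, same size as the image.
    ASSUMPTION: sigma = max(beta / depth, min_sigma); this rule is
    illustrative only and is not stated in the abstract.
    """
    density = [[0.0] * width for _ in range(height)]
    for r, c in heads:
        d = depth[r][c]
        # Nearer heads (small depth) get a wider kernel.
        sigma = max(beta / d, min_sigma) if d > 0 else min_sigma
        kernel = [
            [math.exp(-((y - r) ** 2 + (x - c) ** 2) / (2.0 * sigma ** 2))
             for x in range(width)]
            for y in range(height)
        ]
        # Normalize over the image so each head contributes exactly 1.
        total = sum(map(sum, kernel))
        for y in range(height):
            for x in range(width):
                density[y][x] += kernel[y][x] / total
    return density
```

Because each kernel is normalized to sum to one, integrating the resulting map recovers the head count, which is the property density map regression relies on for counting.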

Citation

Pages: 1821-1830

Page count: 10
