Density Map Regression Guided Detection Network for RGB-D Crowd Counting and Localization

Cited by: 142
Authors
Lian, Dongze [1]
Li, Jing [1]
Zheng, Jia [1]
Luo, Weixin [1,2]
Gao, Shenghua [1]
Affiliations
[1] ShanghaiTech Univ, Shanghai, Peoples R China
[2] Yoke Intelligence, Copenhagen, Denmark
Source
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019
DOI
10.1109/CVPR.2019.00192
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To simultaneously estimate head counts and localize heads with bounding boxes, a regression guided detection network (RDNet) is proposed for RGB-D crowd counting. Specifically, to improve the robustness of detection-based approaches for small/tiny heads, we leverage the density map to improve the head/non-head classification in the detection network, where the density map serves as the probability of a pixel being a head. A depth-adaptive kernel that considers the variances in head sizes is also introduced to generate high-fidelity density maps for more robust density map regression. Further, a depth-aware anchor is designed for better initialization of anchor sizes in the detection framework. We then use the bounding boxes whose sizes are estimated with depth to train our RDNet. Because the existing RGB-D datasets are too small and not suitable for evaluating data-driven approaches, we collect a large-scale RGB-D crowd counting dataset. Experiments on both our RGB-D dataset and the MICC RGB-D counting dataset show that our method achieves the best performance for RGB-D crowd counting and localization. Further, our method can be readily extended to RGB image based crowd counting and achieves comparable performance on the ShanghaiTech Part_B dataset for both counting and localization.
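The abstract names a depth-adaptive kernel but does not give its formula. As a rough illustration only, the sketch below assumes a Gaussian kernel whose standard deviation is inversely proportional to depth (apparent head size shrinks with distance under a pinhole camera model). The function names, the constant `beta`, and the `min_sigma` floor are hypothetical and are not taken from the paper.

```python
import math


def depth_adaptive_density_map(heads, depth, height, width, beta=4.0, min_sigma=0.5):
    """Build a density map in which each head adds a unit-mass Gaussian.

    heads: list of (row, col) head-centre pixel coordinates.
    depth: 2D list (height x width) of depth values, e.g. in metres.
    beta:  hypothetical scale constant; sigma = beta / depth is an assumption.
    """
    density = [[0.0] * width for _ in range(height)]
    for r, c in heads:
        d = depth[r][c]
        # Nearer heads get a wider kernel; invalid depth falls back to min_sigma.
        sigma = max(beta / d, min_sigma) if d > 0 else min_sigma
        radius = int(math.ceil(3 * sigma))
        # Evaluate the Gaussian on the part of the window that lies inside the image.
        weights = {}
        for y in range(max(0, r - radius), min(height, r + radius + 1)):
            for x in range(max(0, c - radius), min(width, c + radius + 1)):
                dist2 = (y - r) ** 2 + (x - c) ** 2
                weights[(y, x)] = math.exp(-dist2 / (2 * sigma ** 2))
        total = sum(weights.values())
        # Normalise so that each head contributes exactly 1 to the map's sum.
        for (y, x), w in weights.items():
            density[y][x] += w / total
    return density


def count_from_density(density):
    """The estimated head count is the sum over the density map."""
    return sum(sum(row) for row in density)
```

Because every kernel is normalised, summing the map recovers the number of annotated heads, while the spread of each blob follows the assumed depth-dependent head size.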
Pages: 1821-1830
Page count: 10