Attend to count: Crowd counting with adaptive capacity multi-scale CNNs

Cited by: 45
Authors
Zou, Zhikang [1]
Cheng, Yu [2]
Qu, Xiaoye [1]
Ji, Shouling [3]
Guo, Xiaoxiao [4]
Zhou, Pan [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Elect Informat & Commun, Wuhan, Hubei, Peoples R China
[2] Microsoft Res & AI, Beijing, Peoples R China
[3] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Zhejiang, Peoples R China
[4] IBM & AI Fdn Learning, Beijing, Peoples R China
Keywords
Crowd counting; Attention mechanism; Multi-scale CNNs; Adaptive capacity;
DOI
10.1016/j.neucom.2019.08.009
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Crowd counting is a challenging task due to the large variations in crowd distributions. Previous methods tend to process the whole image with a single fixed structure, which cannot handle diverse, complicated scenes with different crowd densities. Hence, we propose the Adaptive Capacity Multi-scale convolutional neural network (ACM-CNN), a novel crowd counting approach that can assign different capacities to different portions of the input. The intuition is that the model should focus on important regions of the input image and allocate its capacity according to the local crowd density. ACM-CNN consists of three types of modules: a coarse network, a fine network, and a smooth network. The coarse network uses a count attention mechanism to find the areas that need to be focused on, and generates a rough feature map. The fine network then processes these areas of interest into a fine feature map. To alleviate the visible seams caused by fusion, the smooth network is designed to combine the two feature maps organically and produce high-quality density maps. Extensive experiments are conducted on five mainstream datasets. The results demonstrate the effectiveness of the proposed model for both density estimation and crowd counting tasks. © 2019 Elsevier B.V. All rights reserved.
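As a rough illustration of the coarse/fine/smooth idea in the abstract, the following toy sketch fuses two small density grids with an attention mask and then smooths the result. It is not the ACM-CNN architecture: the threshold rule, the 3x3 mean filter, and every function name here (`attention_mask`, `fuse`, `box_smooth`, `count`) are assumptions made for illustration, and the input grids are made-up numbers rather than results from the paper. The only fact it relies on is the standard convention that a crowd count is the sum of a density map.

```python
def attention_mask(coarse, threshold):
    """Mark cells whose coarse density exceeds a threshold (hypothetical rule)."""
    return [[1.0 if v > threshold else 0.0 for v in row] for row in coarse]


def fuse(coarse, fine, mask):
    """Take the fine estimate where the mask is 1, the coarse estimate elsewhere."""
    return [
        [m * f + (1.0 - m) * c for c, f, m in zip(c_row, f_row, m_row)]
        for c_row, f_row, m_row in zip(coarse, fine, mask)
    ]


def count(density):
    """A crowd count is the sum over all cells of the density map."""
    return sum(sum(row) for row in density)


def box_smooth(grid):
    """3x3 mean filter with edge-clipped windows, rescaled to keep the total count.

    This stands in for a learned smoothing stage; it only softens the seams
    between regions taken from the coarse map and regions taken from the fine map.
    """
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            window = [
                grid[a][b]
                for a in range(max(0, i - 1), min(h, i + 2))
                for b in range(max(0, j - 1), min(w, j + 2))
            ]
            out[i][j] = sum(window) / len(window)
    total_in, total_out = count(grid), count(out)
    scale = total_in / total_out if total_out else 1.0
    return [[v * scale for v in row] for row in out]


# Made-up 2x2 inputs: one dense cell that the "fine" estimate refines.
coarse = [[0.1, 0.2], [0.9, 0.1]]
fine = [[0.0, 0.0], [1.5, 0.0]]

mask = attention_mask(coarse, threshold=0.5)
density = box_smooth(fuse(coarse, fine, mask))
print("estimated count:", count(density))
```

The rescaling step in `box_smooth` is a design choice of this sketch: it keeps the integrated count unchanged so that smoothing affects only the spatial layout of the density map.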
Pages: 75-83
Number of pages: 9