Attend to count: Crowd counting with adaptive capacity multi-scale CNNs

Cited by: 45
Authors
Zou, Zhikang [1]
Cheng, Yu [2]
Qu, Xiaoye [1]
Ji, Shouling [3]
Guo, Xiaoxiao [4]
Zhou, Pan [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Elect Informat & Commun, Wuhan, Hubei, Peoples R China
[2] Microsoft Res & AI, Beijing, Peoples R China
[3] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Zhejiang, Peoples R China
[4] IBM & AI Fdn Learning, Beijing, Peoples R China
Keywords
Crowd counting; Attention mechanism; Multi-scale CNNs; Adaptive capacity
DOI
10.1016/j.neucom.2019.08.009
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Crowd counting is a challenging task due to the large variations in crowd distributions. Previous methods tend to process the whole image with a single fixed structure, which cannot handle diverse, complicated scenes with different crowd densities. Hence, we propose the Adaptive Capacity Multi-scale convolutional neural network (ACM-CNN), a novel crowd counting approach that can assign different capacities to different portions of the input. The intuition is that the model should focus on important regions of the input image and allocate its capacity conditioned on the degree of crowd density. ACM-CNN consists of three types of modules: a coarse network, a fine network, and a smooth network. The coarse network uses a count attention mechanism to identify the areas that need to be focused on and generates a rough feature map. The fine network then processes these areas of interest into a fine feature map. To alleviate the sense of discontinuity caused by fusion, the smooth network combines the two feature maps to produce high-quality density maps. Extensive experiments are conducted on five mainstream datasets. The results demonstrate the effectiveness of the proposed model for both density estimation and crowd counting tasks. (C) 2019 Elsevier B.V. All rights reserved.
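The data flow described in the abstract (a coarse pass, count-based attention, a fine pass on the selected areas, then smoothing after fusion) can be illustrated with a toy sketch. This is not the ACM-CNN implementation: the real modules are learned CNNs operating on images, whereas the sketch below works on a small hand-made density grid. All function names, the `block` tile size, and the `threshold` value are hypothetical and chosen only for illustration.

```python
# Toy illustration of a coarse -> count attention -> fine -> smooth pipeline.
# Assumption: plain arithmetic on a 2D grid stands in for the learned networks.

def block_sums(grid, block):
    """Sum of each block x block tile (grid sides must be multiples of block)."""
    h, w = len(grid), len(grid[0])
    return [[sum(grid[y][x]
                 for y in range(by, by + block)
                 for x in range(bx, bx + block))
             for bx in range(0, w, block)]
            for by in range(0, h, block)]

def coarse_map(grid, block):
    """Stand-in for the coarse network: spread each tile's count evenly."""
    sums = block_sums(grid, block)
    area = block * block
    return [[sums[y // block][x // block] / area
             for x in range(len(grid[0]))]
            for y in range(len(grid))]

def count_attention(grid, block, threshold):
    """Mask is 1 where the tile's count exceeds the (hypothetical) threshold."""
    sums = block_sums(grid, block)
    return [[1 if sums[y // block][x // block] > threshold else 0
             for x in range(len(grid[0]))]
            for y in range(len(grid))]

def fuse(coarse, fine, mask):
    """Use the fine value inside attended areas, the coarse value elsewhere."""
    return [[f if m else c for c, f, m in zip(c_row, f_row, m_row)]
            for c_row, f_row, m_row in zip(coarse, fine, mask)]

def smooth(grid):
    """Mass-conserving 3x3 smoothing: every pixel shares its value equally
    with itself and its in-bounds neighbours, which softens tile borders."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            nbrs = [(ny, nx)
                    for ny in range(y - 1, y + 2)
                    for nx in range(x - 1, x + 2)
                    if 0 <= ny < h and 0 <= nx < w]
            share = grid[y][x] / len(nbrs)
            for ny, nx in nbrs:
                out[ny][nx] += share
    return out

def estimate_density(grid, block=2, threshold=1.0):
    """Run the toy pipeline and return (density_map, attention_mask)."""
    coarse = coarse_map(grid, block)
    mask = count_attention(grid, block, threshold)
    fine = grid  # stand-in for the fine network: full-resolution values
    return smooth(fuse(coarse, fine, mask)), mask

def total(grid):
    """The crowd count is the sum over the density map."""
    return sum(map(sum, grid))

toy = [[0.0, 0.1, 0.0, 0.0],
       [0.1, 0.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 2.0],
       [0.0, 0.0, 0.5, 0.5]]
density, mask = estimate_density(toy)
```

In this toy input only the bottom-right tile is dense enough to be attended, so it keeps full-resolution values while the sparse tiles are represented by their tile average. The count is read off as `total(density)`.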
Pages: 75-83
Number of pages: 9