Region-aware network: Model human's Top-Down visual perception mechanism for crowd counting

Cited by: 17
Authors
Chen, Yuehai [1]
Yang, Jing [1,2]
Zhang, Dong [1]
Zhang, Kun [1]
Chen, Badong [2]
Du, Shaoyi [2]
Affiliations
[1] Xi An Jiao Tong Univ, Fac Elect & Informat Engn, Sch Automat Sci & Engn, Xian 710049, Shaanxi, Peoples R China
[2] Xi An Jiao Tong Univ, Coll Artificial Intelligence, Inst Artificial Intelligence & Robot, Xian 710049, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Crowd counting; Top-Down visual perception mechanism; Priority map; Global context information
DOI
10.1016/j.neunet.2022.01.015
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Background noise and scale variation are long-recognized problems in crowd counting. Humans glance at a crowd image and instantly know the approximate number of people and where they are by attending to the crowd regions and their degree of congestion with a global receptive field. Hence, in this paper, we propose a novel feedback network with a Region-Aware block, called RANet, that models the human Top-Down visual perception mechanism. First, we introduce a feedback architecture to generate priority maps that provide a prior about candidate crowd regions in input images. This prior enables RANet to pay more attention to crowd regions. Then we design a Region-Aware block that adaptively encodes contextual information into input images through a global receptive field. More specifically, we scan each whole input image and its priority map in the form of column vectors to obtain a relevance matrix estimating their similarity. The relevance matrix is then used to build global relationships between pixels. Our method outperforms state-of-the-art crowd counting methods on several public datasets. (C) 2022 Elsevier Ltd. All rights reserved.
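The relevance-matrix idea in the abstract can be illustrated with a small attention-style sketch. This is not the authors' implementation: the function name `region_aware_attention`, the softmax normalization, the `sqrt(c)` scaling, and the residual addition are all assumptions chosen only to show how flattening a feature map and a priority map into column vectors yields a pixel-to-pixel similarity matrix with a global receptive field.

```python
import numpy as np

def region_aware_attention(features, priority):
    """Hypothetical sketch. features, priority: arrays of shape (C, H, W)."""
    c, h, w = features.shape
    # Flatten the spatial grid so that each column is one pixel's feature vector.
    f = features.reshape(c, h * w)
    p = priority.reshape(c, h * w)
    # relevance[i, j]: similarity between pixel i of the features
    # and pixel j of the priority map (scaled dot product, an assumption).
    relevance = f.T @ p / np.sqrt(c)
    # Row-wise softmax (assumed normalization), shifted for numerical stability.
    relevance = relevance - relevance.max(axis=1, keepdims=True)
    weights = np.exp(relevance)
    weights = weights / weights.sum(axis=1, keepdims=True)
    # Every pixel aggregates priority information from all positions.
    context = p @ weights.T
    # Residual connection (assumed) keeps the original features.
    return features + context.reshape(c, h, w), weights
```

Each row of `weights` sums to one, so every output pixel is the input pixel plus a weighted average of priority-map vectors drawn from the whole image rather than from a local neighborhood.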
Pages: 219-231
Number of pages: 13