Crowd Counting by Using Top-k Relations: A Mixed Ground-Truth CNN Framework

Cited by: 43
Authors
Dong, Li [1]
Zhang, Haijun [1]
Yang, Kai [1]
Zhou, Dongliang [1]
Shi, Jianyang [1]
Ma, Jianghong [1]
Affiliations
[1] Harbin Institute of Technology, Department of Computer Science, Shenzhen 518055, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Crowd counting; mixed ground truth; self-attention; top-k relations
DOI
10.1109/TCE.2022.3190384
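Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract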
Crowd counting has important applications in smart-city environments, such as intelligent surveillance. In this paper, we propose a novel convolutional neural network (CNN) framework for crowd counting with mixed ground truth, called the top-k relation-based network (TKRNet). The estimated density maps, generated in a coarse-to-fine manner, are treated as coarse crowd locations that help TKRNet regress the scattered point-annotated ground truth. Moreover, an adaptive top-k relation module (ATRM) is proposed to enhance feature representations by leveraging the top-k dependencies between pixels with an adaptive filtering mechanism. Specifically, we first compute the similarity between pairs of pixels and select the top-k relations for each position. Then, a weight normalization operation with an adaptive filtering mechanism lets the ATRM adaptively eliminate the influence of low-correlation positions among the top-k relations. Finally, a weight attention mechanism makes the ATRM pay more attention to positions with high weights in the top-k relations. Extensive experimental results on several public datasets demonstrate the effectiveness of the proposed TKRNet in comparison with state-of-the-art methods.
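The three ATRM steps named in the abstract (top-k selection, normalization with filtering, weight attention) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the formulas, so the scaled dot-product similarity, the below-mean filtering rule, and the squaring used as a stand-in for weight attention are all assumptions, and the function name `topk_relation_aggregate` is hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(ws):
    total = sum(ws)
    return [w / total for w in ws]


def topk_relation_aggregate(features, k):
    """features: list of N feature vectors, one per pixel position.

    Returns (outputs, weights), where weights[i] maps each retained
    position index to its final weight for query position i.
    """
    dim = len(features[0])
    outputs, all_weights = [], []
    for query in features:
        # Step 1: similarity to every position, keep the k largest.
        # ASSUMPTION: scaled dot product as the similarity measure.
        sims = [dot(query, other) / math.sqrt(dim) for other in features]
        top = sorted(range(len(features)), key=lambda j: sims[j], reverse=True)[:k]

        # Step 2: softmax over the k relations, then filter weak ones.
        # ASSUMPTION: "adaptive filtering" modeled as zeroing weights
        # below the mean weight (1/k) and renormalizing.
        peak = sims[top[0]]
        weights = normalize([math.exp(sims[j] - peak) for j in top])
        mean_w = 1.0 / len(top)
        weights = normalize([w if w >= mean_w - 1e-12 else 0.0 for w in weights])

        # Step 3: emphasize high-weight positions.
        # ASSUMPTION: squaring and renormalizing stands in for the
        # paper's weight attention mechanism.
        weights = normalize([w * w for w in weights])

        # Aggregate the selected features with the final weights.
        out = [sum(w * features[j][d] for w, j in zip(weights, top)) for d in range(dim)]
        outputs.append(out)
        all_weights.append(dict(zip(top, weights)))
    return outputs, all_weights


# Toy input: positions 0 and 1 are similar, as are positions 2 and 3.
feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
outputs, weights = topk_relation_aggregate(feats, k=2)
```

In this toy input each position relates only to itself and its near-duplicate, so the dissimilar pair never contributes to the aggregated feature. A real module would operate on CNN feature maps and learn its projections; the sketch only shows the data flow.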
Pages: 307-316
Number of pages: 10