Multi-Scale and spatial position-based channel attention network for crowd counting

Cited by: 6

Authors:

Wang, Lin [1]

Li, Jie [1]

Zhang, Siqi [2]

Qi, Chun [1]

Wang, Pan [1]

Wang, Fengping [1]

Affiliations:

[1] Xi An Jiao Tong Univ, Fac Elect & Informat Engn, Sch Informat & Commun Engn, Xian 710049, Peoples R China

[2] Xian Modern Control Technol Res Inst, Xian 710065, Peoples R China

Source:

JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION | 2023 / Volume 90

Funding:

National Natural Science Foundation of China;

Keywords:

Crowd counting; Spatial position-based channel attention model; Multi-scale structure; Adaptive loss;

DOI:

10.1016/j.jvcir.2022.103718

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Crowd counting algorithms have recently made significant progress by incorporating attention mechanisms into convolutional neural networks (CNNs). The channel attention model (CAM), a popular attention mechanism, computes a set of probability weights to select important channel-wise feature responses. However, most CAMs assign a single weight to an entire channel-wise map, so useful and useless information within that map is treated indiscriminately, which limits the representational capacity of the network. In this paper, we propose a multi-scale and spatial position-based channel attention network (MS-SPCANet), which integrates spatial position-based channel attention models (SPCAMs) at multiple scales into a CNN. SPCAM assigns different channel attention weights to different positions of the channel-wise maps to capture more informative features. Furthermore, an adaptive loss, which uses adaptive coefficients to combine a density map loss and a headcount loss, is constructed to improve network performance in sparse crowd scenes. Experimental results on four public datasets verify the superiority of the proposed scheme.
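The contrast the abstract draws between a per-channel weight and per-position channel weights, and the idea of blending a density map loss with a headcount loss, can be illustrated with a minimal sketch. The abstract does not say how SPCAM computes its weights or how the adaptive coefficients are defined, so everything below is an assumption made for illustration: the softmax over raw activations, the function names, the `sparse_scale` parameter, and the form of `alpha` are hypothetical and are not the authors' method.

```python
import math


def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def global_channel_attention(features):
    """Conventional CAM-style weighting: one weight per channel, shared by all positions.

    features is a list of C channels, each an H x W list of lists.
    The weights here come from a softmax over channel means (an assumption).
    """
    means = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in features]
    weights = softmax(means)
    return [[[w * v for v in row] for row in ch] for ch, w in zip(features, weights)]


def position_channel_attention(features):
    """Position-wise weighting: a separate softmax over channels at each (i, j).

    Hypothetical stand-in for the SPCAM idea; a real model would learn the weights.
    """
    C, H, W = len(features), len(features[0]), len(features[0][0])
    out = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for i in range(H):
        for j in range(W):
            weights = softmax([features[c][i][j] for c in range(C)])
            for c in range(C):
                out[c][i][j] = weights[c] * features[c][i][j]
    return out


def combined_loss(pred, gt, sparse_scale=50.0):
    """Blend a density map loss (MSE) with a headcount loss (absolute count error).

    alpha is a hypothetical coefficient that grows as the ground-truth count shrinks,
    so the headcount term dominates in sparse scenes.
    """
    cells = len(gt) * len(gt[0])
    density_loss = sum(
        (p - g) ** 2 for pr, gr in zip(pred, gt) for p, g in zip(pr, gr)
    ) / cells
    count_loss = abs(sum(map(sum, pred)) - sum(map(sum, gt)))
    alpha = sparse_scale / (sparse_scale + sum(map(sum, gt)))
    return (1.0 - alpha) * density_loss + alpha * count_loss


# Two 2x2 channels, each strongly active at a different position.
features = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 1.0], [0.0, 0.0]]]
shared = global_channel_attention(features)
per_position = position_channel_attention(features)
```

In this toy input both channels have the same mean, so the shared weighting scales each channel by the same factor, while the position-wise weighting gives each channel a larger weight at the position where it is most active.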

Pages: 12