Multi-scale Attention Recalibration Network for crowd counting

Cited by: 9

Authors:

Xie, Jinyang [1]

Pang, Chen [1]

Zheng, Yanjun [3]

Li, Liang [1,2]

Lyu, Chen [1,2]

Lyu, Lei [1,2]

Liu, Hong [1,2]

Affiliations:

[1] Shandong Normal Univ, Sch Informat Sci & Engn, Jinan 250358, Peoples R China

[2] Shandong Prov Key Lab Distributed Comp Software N, Jinan 250358, Peoples R China

[3] Shandong Big Data Ctr, Jinan 250011, Peoples R China

Source:

APPLIED SOFT COMPUTING | 2022 / Vol. 117

Funding:

National Natural Science Foundation of China;

Keywords:

Crowd counting; Attention module; Deep learning; Multi-scale feature;

DOI:

10.1016/j.asoc.2022.108457

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Crowd counting using deep convolutional neural networks (CNNs) has achieved encouraging progress in recent years. Nevertheless, efficiently addressing scale variation and complex backgrounds remains a major challenge. To this end, we present a Multi-scale Attention Recalibration Network, termed MARNet, for more accurate crowd counting. This is achieved mainly by integrating two modules into the proposed model. First, a Feature Pyramid Module (FPM) is designed to achieve multi-scale feature enhancement by applying multiple dilated convolutions with different rates, thus providing rich contextual information for subsequent operations. Second, to take full advantage of this contextual information, a Feature Recalibration Module (FRM) is devised by integrating a Dimension Attention (DA) block with a Region Recalibration (RR) block. The DA block models the semantic dependencies between different dimensions of the contextual information, while the RR block reassigns attention weights to different regions based on those semantic dependencies. By integrating these two blocks, the proposed method can focus on crowd features and thereby estimate crowd density accurately. Extensive experiments on multiple public crowd counting datasets demonstrate that our method significantly outperforms most existing methods in terms of counting accuracy and the quality of the generated density maps. (c) 2022 Elsevier B.V. All rights reserved.
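The abstract describes the FPM as several dilated convolutions with different rates applied to the same features. The sketch below is not the authors' implementation; it is a minimal pure-Python illustration of that general idea on a 1-D signal. The function names, the rates (1, 2, 3), the all-ones kernel, and fusion by element-wise summation are all assumptions made for illustration, since the abstract does not specify them.

```python
def effective_kernel_size(k, rate):
    """Span covered by a k-tap kernel whose taps are spaced `rate` apart."""
    return k + (k - 1) * (rate - 1)


def dilated_conv1d(signal, kernel, rate):
    """1-D dilated cross-correlation with zero padding (same-length output)."""
    k = len(kernel)
    span = (k - 1) * rate                 # distance between first and last tap
    pad = span // 2                       # left padding; the rest goes on the right
    padded = [0.0] * pad + list(signal) + [0.0] * (span - pad)
    return [
        sum(kernel[j] * padded[i + j * rate] for j in range(k))
        for i in range(len(signal))
    ]


def pyramid_features(signal, kernel, rates=(1, 2, 3)):
    """Hypothetical multi-branch fusion: one dilated branch per rate, summed.

    Summation is an assumption; concatenation or learned weighting are
    equally plausible choices that the abstract does not rule out.
    """
    branches = [dilated_conv1d(signal, kernel, r) for r in rates]
    return [sum(values) for values in zip(*branches)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    ones = [1, 1, 1]
    for r in (1, 2, 3):
        print(r, effective_kernel_size(3, r), dilated_conv1d(impulse, ones, r))
    print(pyramid_features(impulse, ones))
```

Feeding an impulse through each branch shows why varying the rate gives multi-scale context: the same three taps respond over a span of 3, 5, and 7 positions as the rate grows from 1 to 3, with no additional weights.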


Pages: 11
