Cascade-guided multi-scale attention network for crowd counting

Cited by: 2

Authors:

Li, Shufang [1,2]

Hu, Zhengping [1]

Zhao, Mengyao [1]

Sun, Zhe [1]

Affiliations:

[1] Yanshan Univ, Sch Informat Sci & Engn, West Hebei St 438, Qinhuangdao 066004, Hebei, Peoples R China

[2] Hebei Univ Environm Engn, Dept Informat Engn, Jingang Rd 8, Qinhuangdao 066102, Hebei, Peoples R China

Source:

SIGNAL IMAGE AND VIDEO PROCESSING | 2021 / Vol. 15 / Issue 08

Funding:

National Natural Science Foundation of China; China Postdoctoral Science Foundation;

Keywords:

Crowd counting; Cascade-guided scale-aware; Attention-aware; High-quality density map;

DOI:

10.1007/s11760-021-01903-8

Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

The performance of density-estimation-based crowd counting has improved greatly with the development of deep learning. However, obtaining a high-quality density map remains a major challenge because of background clutter and the interference of perspective changes within and between scenes. In this paper, we propose a cascade-guided crowd counting network built mainly around a scale aware model (SAM) and an attention aware model (AAM). First, SAM uses a share-net design and multi-directional perspective transform in convolution to handle multi-scale variation and smooth transitions, while reducing background noise in shallow features. Second, AAM further encodes semantic interdependencies along the two dimensions of location and channel, so that the network learns to attend to key information. Finally, the global and local features are concatenated and fed into a decoder to generate the estimated density map for crowd counting. Comprehensive experiments on three established datasets show that the proposed method achieves higher accuracy and is more robust to scale variation and background noise.
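The two ideas the abstract relies on, attention over location and channel and counting from a density map, can be illustrated with a minimal NumPy sketch. This is not the authors' AAM: the function names, the plain dot-product affinities, and the residual connections are assumptions for illustration, loosely in the style of the dual-attention design cited as reference [4].

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def position_attention(feat):
    # Hypothetical location branch: feat has shape (C, H, W) and every
    # spatial location attends to every other location.
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)            # (C, N) with N = H * W
    attn = softmax(flat.T @ flat, axis=-1)   # (N, N) location affinities
    out = flat @ attn.T                      # (C, N) re-weighted features
    return out.reshape(c, h, w) + feat       # residual connection (assumed)

def channel_attention(feat):
    # Hypothetical channel branch: every channel attends to every other channel.
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)            # (C, N)
    attn = softmax(flat @ flat.T, axis=-1)   # (C, C) channel affinities
    out = attn @ flat                        # (C, N) re-weighted features
    return out.reshape(c, h, w) + feat       # residual connection (assumed)

def crowd_count(density_map):
    # In density-estimation counting, the predicted count is the sum
    # (discrete integral) of the density map.
    return float(density_map.sum())

# Toy feature map standing in for backbone output (random, illustration only).
rng = np.random.default_rng(0)
features = rng.standard_normal((4, 6, 8))
pos_out = position_attention(features)
chn_out = channel_attention(features)
```

Both branches return a tensor with the same shape as their input, so they could be fused by summation or concatenation before a decoder; which fusion the paper uses for these branches is not stated in the abstract.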

References

Pages: 1663-1670

Page count: 8

33 entries in total

[1] Scale Aggregation Network for Accurate and Efficient Crowd Counting [J].

Cao, Xinkun ;

Wang, Zhipeng ;

Zhao, Yanyun ;

Su, Fei .

COMPUTER VISION - ECCV 2018, PT V, 2018, 11209 :757-773

[2] Crowd counting with crowd attention convolutional neural network [J].

Chen, Jiwei ;

Su, Wen ;

Wang, Zengfu .

NEUROCOMPUTING, 2020, 382 :210-220

[3] Scale-Recursive Network with point supervision for crowd scene analysis [J].

Dong, Zihao ;

Zhang, Ruixun ;

Shao, Xiuli ;

Li, Yumeng .

NEUROCOMPUTING, 2020, 384 :314-324

[4] Dual Attention Network for Scene Segmentation [J].

Fu, Jun ;

Liu, Jing ;

Tian, Haijie ;

Li, Yong ;

Bao, Yongjun ;

Fang, Zhiwei ;

Lu, Hanqing .

2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :3141-3149

[5] PCC Net: Perspective Crowd Counting via Spatial Convolutional Network [J].

Gao, Junyu ;

Wang, Qi ;

Li, Xuelong .

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2020, 30 (10) :3486-3498

[6] SCAR: Spatial-/channel-wise attention regression networks for crowd counting [J].

Gao, Junyu ;

Wang, Qi ;

Yuan, Yuan .

NEUROCOMPUTING, 2019, 363 :1-8

[7] He KM. LECT NOTES COMPUT SC, 2014, 8691 :346. DOI: 10.1007/978-3-319-10578-9_23 (arXiv:1406.4729)

[8] Composition Loss for Counting, Density Map Estimation and Localization in Dense Crowds [J].

Idrees, Haroon ;

Tayyab, Muhmmad ;

Athrey, Kishan ;

Zhang, Dong ;

Al-Maadeed, Somaya ;

Rajpoot, Nasir ;

Shah, Mubarak .

COMPUTER VISION - ECCV 2018, PT II, 2018, 11206 :544-559

[9] Multi-Source Multi-Scale Counting in Extremely Dense Crowd Images [J].

Idrees, Haroon ;

Saleemi, Imran ;

Seibert, Cody ;

Shah, Mubarak .

2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2013, :2547-2554

[10] Crowd Counting and Density Estimation by Trellis Encoder-Decoder Networks [J].

Jiang, Xiaolong ;

Xiao, Zehao ;

Zhang, Baochang ;

Zhen, Xiantong ;

Cao, Xianbin ;

Doermann, David ;

Shao, Ling .

2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :6126-6135
