Lightweight multi-scale network with attention for accurate and efficient crowd counting

Cited by: 2
Authors
Xi, Mengyuan [1]
Yan, Hua [1]
Affiliations
[1] Sichuan Univ, Coll Elect & Informat Engn, Chengdu 610065, Sichuan, Peoples R China
Keywords
Crowd counting; Feature fusion; Lightweight network; Multi-scale; Normalized union loss; NEURAL-NETWORK;
DOI
10.1007/s00371-023-03099-z
Chinese Library Classification number
TP31 [Computer software];
Discipline classification codes
081202 ; 0835 ;
Abstract
Crowd counting is an important task in computer vision that aims to estimate the total number of people appearing in images or videos. It remains very challenging because of the large scale variation and uneven density distributions in dense scenes. Moreover, although many methods have been proposed to tackle these issues, they typically have a large number of parameters and high computational complexity, which limits their deployment on edge devices. In this work, we propose a lightweight method for accurate and efficient crowd counting, called the lightweight multi-scale network with attention. It is composed of four main parts: a lightweight extractor, a multi-scale feature extraction module (MFEM), an attention-based fusion module (ABFM), and an efficient density map regressor. The MFEM and ABFM are designed to obtain rich scale representations, which significantly improves counting accuracy. In addition, a normalized union loss function is proposed to balance the contributions of samples with diverse density distributions. Extensive experiments on six mainstream crowd datasets demonstrate that the proposed method achieves performance superior to other state-of-the-art methods with a small model size and low computational cost.
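The abstract mentions a density map regressor without describing it further. As background only, and not as the authors' implementation, the following minimal sketch illustrates the usual density-map formulation of crowd counting: each annotated head position contributes a Gaussian kernel normalized to sum to one, so the sum over the map equals the number of people. The function names, the grid size, the head coordinates, and the value of `sigma` are all hypothetical choices made for illustration.

```python
import math


def gaussian_density_map(points, height, width, sigma=2.0):
    """Build a density map by placing one normalized Gaussian per head point.

    `points` is a list of (row, col) head positions (hypothetical data).
    Each kernel is normalized over the grid, so it contributes exactly 1.
    """
    density = [[0.0] * width for _ in range(height)]
    for (py, px) in points:
        # Unnormalized Gaussian evaluated at every pixel of the grid.
        weights = [
            [math.exp(-((y - py) ** 2 + (x - px) ** 2) / (2 * sigma ** 2))
             for x in range(width)]
            for y in range(height)
        ]
        total = sum(map(sum, weights))
        # Add the kernel, rescaled so that it sums to one over the grid.
        for y in range(height):
            for x in range(width):
                density[y][x] += weights[y][x] / total
    return density


def count_from_density(density):
    """The estimated crowd count is the sum of all density values."""
    return sum(map(sum, density))


# Three hypothetical head annotations on a 32 x 32 image.
heads = [(5, 5), (10, 20), (25, 12)]
density = gaussian_density_map(heads, 32, 32)
estimated_count = count_from_density(density)
```

In this formulation a network is trained to regress such a map from the image, and the predicted count is read off by summing the predicted map.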
Pages: 4553-4566
Number of pages: 14