A Multi-Scale Feature Fusion Network With Cascaded Supervision for Cross-Scene Crowd Counting

Authors
Zhang, Xinfeng [1]
Han, Lina [1]
Shan, Wencong [1]
Wang, Xiaohu [1]
Chen, Shuhan [1]
Zhu, Congcong [1]
Li, Bin [1]
Affiliations
[1] Yangzhou Univ, Coll Informat Engn, Coll Artificial Intelligence, Jiangsu Prov Engn Res Ctr Knowledge Management & I, Yangzhou 225127, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Training; Image resolution; Location awareness; Annotations; Testing; Training data; Background suppression (BS) loss; cascaded supervision; crowd counting; dilated convolution; multi-scale feature fusion; SCALE;
DOI
10.1109/TIM.2023.3246534
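Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809;
Abstract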
Counting the number of people in public places has received much attention, and researchers have devoted much effort to the task. However, existing crowd counting approaches are mainly trained and tested in similar scenarios. Their performance degrades sharply when the test scenes differ in type from the training scenes. In practice, crowd scenes are highly variable, and the lack of cross-scene capability could seriously limit the application of existing approaches. We argue that improving cross-scene crowd counting requires accommodating large changes in the scale of individuals and suppressing the interference of cluttered backgrounds. To this end, we propose a multi-scale feature fusion network (MFFNet) with cascaded supervision. The multi-scale features extracted from the crowd images are upsampled and then combined into several feature blocks, followed by convolution and deconvolution operations on the feature blocks to derive feature matrices of different resolutions. The feature matrices are fused from bottom to top. During feature fusion, the crowd density maps corresponding to the feature matrices of different resolutions are predicted separately. We devise cascaded supervision to jointly optimize the density map predictions at the different resolutions during training. The cross-scene crowd counting experiments are conducted on four types of scenes: ShanghaiTech Part_A (SHT A) with high-density crowd scenes and small-scale individuals, ShanghaiTech Part_B (SHT B) with sparse crowd distribution and medium-scale individuals, the UCF_CC_50 dataset with extremely dense scenes and tiny-scale individuals, and the UCF-QNRF dataset with extreme variations. MFFNet exhibits the strongest scene adaptability relative to the state-of-the-art approaches, with an average decrease of 17.1% and 8.4% in mean absolute error (MAE) and mean square error (MSE), respectively. The contributions of the different components of our method are verified in an ablation study using the devised evaluation metrics. Our implementation will be available at https://github.com/learnsharing/MFFNet.
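The abstract reports MAE and MSE without defining them. The sketch below shows how these two metrics are commonly computed in crowd counting, where the predicted count for an image is the sum of its density map. It is an illustration under stated assumptions, not the authors' evaluation code: the function names and the toy numbers are hypothetical, and the square root in `mse` follows a convention common in crowd counting papers that this record does not confirm for MFFNet.

```python
import math


def count_from_density_map(density_map):
    """Estimated head count: the sum of all values in a 2-D density map."""
    return sum(sum(row) for row in density_map)


def mae(pred_counts, gt_counts):
    """Mean absolute error between predicted and ground-truth counts."""
    n = len(gt_counts)
    return sum(abs(p - g) for p, g in zip(pred_counts, gt_counts)) / n


def mse(pred_counts, gt_counts):
    """Assumed convention: square root of the mean squared count error."""
    n = len(gt_counts)
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred_counts, gt_counts)) / n)


# Toy data (hypothetical): one count derived from a tiny density map, one given directly.
pred = [count_from_density_map([[0.5, 1.5], [2.0, 1.0]]), 12.0]
gt = [4.0, 15.0]

print(mae(pred, gt))
print(mse(pred, gt))
```

A cross-scene evaluation would compute these values for a model trained on one dataset and tested on another, then compare them across methods.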
Pages: 15