CLFormer: a unified transformer-based framework for weakly supervised crowd counting and localization

Cited by: 0
Authors
Mingfang Deng
Huailin Zhao
Ming Gao
Affiliation
[1] Shanghai Institute of Technology, School of Electrical and Electronic Engineering
Keywords
Shunted Transformer; Weakly supervised learning; Crowd counting; Crowd localization
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Recent progress in crowd counting and localization methods mainly relies on expensive point-level annotations and convolutional neural networks with limited receptive fields, which hinders their application in complex real-world scenes. To this end, we present CLFormer, a Transformer-based weakly supervised crowd counting and localization framework. The model extracts global information from the input image using a Transformer and then passes the extracted features to both a regression branch for crowd counting and a localization branch for localization. Initial proposals are produced by the localization branch and filtered via score maps generated from the extracted features, and their centers are used as pseudo-point-level annotations. Through staggered training of the two branches, the quality of the pseudo-point-level annotations is improved, and the final localization maps are generated. Experiments on four benchmark datasets (i.e., ShanghaiTech, UCF-QNRF, JHU-CROWD++, and NWPU-Crowd) demonstrate that CLFormer obtains better counting performance than weakly supervised and fully supervised counting networks and localization performance comparable to that of fully supervised localization networks.
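The proposal-filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how score maps are applied, so everything here is an assumption made for illustration: the box format `(x1, y1, x2, y2)`, the rule of looking up the score at each proposal's center, the `threshold` parameter, and the function name `pseudo_points` are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels (assumed format)
Point = Tuple[float, float]              # (x, y)


def pseudo_points(proposals: List[Box],
                  score_map: List[List[float]],
                  threshold: float = 0.5) -> List[Point]:
    """Keep proposals whose center falls on a high-scoring cell; return the centers.

    Assumption: the filtering rule (score at the center >= threshold) is a
    stand-in for whatever criterion the paper actually uses.
    """
    height, width = len(score_map), len(score_map[0])
    points = []
    for x1, y1, x2, y2 in proposals:
        cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
        # Clamp the center to the map so boxes touching the border stay in range.
        col = min(max(int(cx), 0), width - 1)
        row = min(max(int(cy), 0), height - 1)
        if score_map[row][col] >= threshold:
            points.append((cx, cy))
    return points


# Toy 4x4 score map with two confident cells.
score_map = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.8],
    [0.1, 0.1, 0.1, 0.1],
]
proposals = [(0, 0, 3, 3), (2, 1, 4, 4), (0, 2, 1, 4)]
points = pseudo_points(proposals, score_map)
```

In this toy setup the surviving centers would serve as pseudo-point-level annotations, and their number gives a rough count that could be compared against the regression branch's output.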
Pages: 1053-1067
Page count: 14