JMFEEL-Net: a joint multi-scale feature enhancement and lightweight transformer network for crowd counting

Cited by: 2

Authors:

Wang, Mingtao [1]

Zhou, Xin [1]

Chen, Yuanyuan [1]

Affiliation:

[1] Sichuan Univ, Coll Comp Sci, Chengdu 610065, Sichuan, Peoples R China

Source:

KNOWLEDGE AND INFORMATION SYSTEMS | 2024 / Vol. 66 / Issue 05

Keywords:

Crowd counting; Count estimation; Multi-scale variations; Multi-density map supervision; PEOPLE; SCALE; MODEL;

DOI:

10.1007/s10115-023-02056-5

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Crowd counting based on convolutional neural networks (CNNs) has made significant progress in recent years. However, the limited receptive field of CNNs makes it challenging to capture global features for comprehensive contextual modeling, resulting in insufficient accuracy in count estimation. In comparison, vision transformer (ViT)-based counting networks have demonstrated remarkable performance by exploiting their powerful global contextual modeling capabilities. However, ViT models are associated with higher computational costs and training difficulty. In this paper, we propose a novel network named JMFEEL-Net, which utilizes joint multi-scale feature enhancement and lightweight transformer to improve crowd counting accuracy. Specifically, we use a high-resolution CNN as the backbone network to generate high-resolution feature maps. In the backend network, we propose a multi-scale feature enhancement module to address the problem of low recognition accuracy caused by multi-scale variations, especially when counting small-scale objects in dense scenes. Furthermore, we introduce an improved lightweight ViT encoder to effectively model complex global contexts. We also adopt a multi-density map supervision strategy to learn crowd distribution features from feature maps of different resolutions, thereby improving the quality and training efficiency of the density maps. To validate the effectiveness of the proposed method, we conduct extensive experiments on four challenging datasets, namely ShanghaiTech Part A/B, UCF-QNRF, and JHU-Crowd++, achieving very competitive counting performance.
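The multi-density map supervision idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the loss, the resolutions, or the weighting, so the function names (sum_pool, mse, multi_density_loss, count), the use of mean squared error, the sum-pooling of the ground truth, and the equal default weights are all assumptions made for illustration only.

```python
# Hypothetical sketch of supervising density maps at several resolutions.
# All names, the MSE loss, and the default weights are illustrative
# assumptions; none of them are taken from the paper.

def sum_pool(density, factor):
    """Downsample a 2D density map by summing factor x factor blocks.

    Summing (rather than averaging) keeps the total count unchanged.
    """
    rows, cols = len(density), len(density[0])
    if rows % factor or cols % factor:
        raise ValueError("map size must be divisible by the pooling factor")
    return [
        [
            sum(
                density[r + dr][c + dc]
                for dr in range(factor)
                for dc in range(factor)
            )
            for c in range(0, cols, factor)
        ]
        for r in range(0, rows, factor)
    ]


def mse(pred, target):
    """Mean squared error between two equally sized 2D maps."""
    n = len(pred) * len(pred[0])
    return sum(
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    ) / n


def multi_density_loss(preds_by_factor, gt_density, weights=None):
    """Weighted sum of MSE losses, one per predicted resolution.

    preds_by_factor maps a downsampling factor (1, 2, 4, ...) to the
    predicted density map at that resolution.
    """
    weights = weights or {f: 1.0 for f in preds_by_factor}
    total = 0.0
    for factor, pred in preds_by_factor.items():
        target = gt_density if factor == 1 else sum_pool(gt_density, factor)
        total += weights[factor] * mse(pred, target)
    return total


def count(density):
    """The estimated crowd count is the sum over the density map."""
    return sum(map(sum, density))
```

In this sketch the ground-truth map is reduced by sum-pooling so that every resolution carries the same total count, and each predicted map is compared against the target at its own resolution. A real training setup would more likely use a tensor library and learned or tuned weights per scale.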

Pages: 3033 - 3053

Page count: 21

50 items in total

[11] Multi-scale supervised network for crowd counting
Wang, Yongjie
Zhang, Wei
Huang, Dongxiao
Liu, Yanyan
Zhu, Jianghua
IET IMAGE PROCESSING, 2020, 14 (17) : 4701 - 4707
[12] MSFFA: a multi-scale feature fusion and attention mechanism network for crowd counting
Li, Zhaoxin
Lu, Shuhua
Dong, Yishan
Guo, Jingyuan
VISUAL COMPUTER, 2023, 39 (03) : 1045 - 1056
[13] MSFFA: a multi-scale feature fusion and attention mechanism network for crowd counting
Zhaoxin Li
Shuhua Lu
Yishan Dong
Jingyuan Guo
The Visual Computer, 2023, 39 : 1045 - 1056
[14] Deep feature network with multi-scale fusion for highly congested crowd counting
Leilei Yan
Li Zhang
Xiaohan Zheng
Fanzhang Li
International Journal of Machine Learning and Cybernetics, 2024, 15 : 819 - 835
[15] MSFFNet: multi-scale feature fusion network with semantic optimization for crowd counting
Rohra, Avinash
Yin, Baoqun
Bilal, Hazrat
Kumar, Aakash
Ali, Munawar
Li, Yang
PATTERN ANALYSIS AND APPLICATIONS, 2025, 28 (01)
[16] Deep feature network with multi-scale fusion for highly congested crowd counting
Yan, Leilei
Zhang, Li
Zheng, Xiaohan
Li, Fanzhang
INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, 15 (03) : 819 - 835
[17] LigMSANet: Lightweight multi-scale adaptive convolutional neural network for dense crowd counting
Jiang, Guoquan
Wu, Rui
Huo, Zhanqiang
Zhao, Cuijun
Luo, Junwei
EXPERT SYSTEMS WITH APPLICATIONS, 2022, 197
[18] MLANet: multi-level attention network with multi-scale feature fusion for crowd counting
Xiong, Liyan
Zeng, Yijuan
Huang, Xiaohui
Li, Zhida
Huang, Peng
CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2024, 27 (05) : 6591 - 6608
[19] Redesigning Multi-Scale Neural Network for Crowd Counting
Du, Zhipeng
Shi, Miaojing
Deng, Jiankang
Zafeiriou, Stefanos
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 3664 - 3678
[20] Crowd Counting based on Multi-level Multi-scale Feature
Di Wu
Zheyi Fan
Shuhan Yi
Applied Intelligence, 2023, 53 : 21891 - 21901
