Efficiently adapting large pre-trained models for real-time violence recognition in smart city surveillance
Cited by: 0
Authors:
Ren, Xiaohui [1]
Fan, Wenze [2]
Wang, Yinghao [1]
Affiliations:
[1] Liaocheng Univ, Sch Comp, Liaocheng 252059, Shandong, Peoples R China
[2] Inner Mongolia Univ Technol, Coll Mech Engn, Hohhot, Inner Mongolia, Peoples R China
Keywords:
Smart city;
Surveillance video;
Real-time violence recognition;
Large pre-trained model;
Parameter-efficient fine-tuning;
DOI:
10.1007/s11554-024-01486-w
Chinese Library Classification (CLC) number:
TP18 [Artificial intelligence theory];
Discipline classification codes:
081104 ;
0812 ;
0835 ;
1405 ;
Abstract:
Smart cities, which aim to enhance urban efficiency, safety, and quality of life through advanced technologies, have recently gained prominence. A critical component of this infrastructure is the extensive use of surveillance systems that monitor public spaces to detect violent behavior. As the scale of data and models grows, large-scale pre-trained models demonstrate remarkable capabilities across a wide range of applications. However, adapting these models for violence recognition in surveillance videos poses several challenges, including fine-tuning cost, a lack of temporal modeling, and inference overhead. In this paper, we propose an efficient framework for adapting pre-trained models to violent behavior recognition. The framework consists of two paths, a spatial path and a motion path, supports real-time parameter updating and real-time inference, and is adaptable to various ViT-based pre-trained models. Both paths adopt a parameter-efficient fine-tuning pipeline so that model updating remains real-time. Within the motion path, multiple frames must be processed to capture temporal features, which makes real-time inference challenging. To improve inference efficiency, we therefore compress multiple frames into the size of a single standard image. Experiments on five datasets demonstrate that our framework achieves state-of-the-art performance, efficiently transferring large pre-trained models to violent behavior recognition.
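The abstract does not say how multiple frames are compressed into the size of a single standard image. One plausible reading, sketched here purely as an assumption, is to downsample each frame and tile the results into a square grid, so that a ViT-style backbone sees one input of the usual resolution. The function name `frames_to_grid`, the 2x2 grid, the 224x224 resolution, and the use of average pooling are all hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np


def frames_to_grid(frames: np.ndarray, grid: int = 2) -> np.ndarray:
    """Tile grid*grid frames into one image of the original frame size.

    Hypothetical illustration, not the paper's method.
    frames: array of shape (T, H, W, C) with T == grid * grid,
            and H, W divisible by `grid`.
    Returns an array of shape (H, W, C).
    """
    t, h, w, c = frames.shape
    if t != grid * grid:
        raise ValueError(f"expected {grid * grid} frames, got {t}")
    if h % grid or w % grid:
        raise ValueError("frame height and width must be divisible by grid")

    sh, sw = h // grid, w // grid
    # Downsample each frame by average pooling over grid x grid blocks.
    small = frames.reshape(t, sh, grid, sw, grid, c).mean(axis=(2, 4))
    # Arrange frames row-major: frame index = row * grid + col.
    tiles = small.reshape(grid, grid, sh, sw, c)
    # Reorder to (row, y, col, x, c), then merge into one (H, W, C) image.
    return tiles.transpose(0, 2, 1, 3, 4).reshape(h, w, c)


# Usage: four 224x224 RGB frames become one 224x224 composite.
clip = np.stack([np.full((224, 224, 3), k, dtype=np.float32) for k in range(4)])
composite = frames_to_grid(clip, grid=2)
```

Under this assumption the cost of a forward pass stays that of one image, at the price of lower spatial resolution per frame.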
Pages: 10