Efficiently adapting large pre-trained models for real-time violence recognition in smart city surveillance

Cited by: 0
Authors
Ren, Xiaohui [1]
Fan, Wenze [2]
Wang, Yinghao [1]
Affiliations
[1] Liaocheng Univ, Sch Comp, Liaocheng 252059, Shandong, Peoples R China
[2] Inner Mongolia Univ Technol, Coll Mech Engn, Hohhot, Inner Mongolia, Peoples R China
Keywords
Smart city; Surveillance video; Real-time violence recognition; Large pre-trained model; Parameter-efficient fine-tuning
DOI
10.1007/s11554-024-01486-w
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Smart cities have recently gained prominence as a way to improve urban efficiency, safety, and quality of life through advanced technologies. A critical component of this infrastructure is the extensive use of surveillance systems that monitor public spaces to detect violent behavior. As data and models scale up, large pre-trained models demonstrate remarkable capabilities across a wide range of applications. However, adapting these models to violence recognition in surveillance videos poses several challenges, including fine-tuning cost, lack of temporal modeling, and inference overhead. In this paper, we propose an efficient framework for adapting pre-trained models to violent behavior recognition. It consists of two paths, a spatial path and a motion path, supports real-time parameter updating and real-time inference, and is adaptable to various ViT-based pre-trained models. Both paths follow a parameter-efficient fine-tuning pipeline so that model updates remain real-time. Within the motion path, multiple frames must be processed to capture temporal features, which makes real-time inference challenging. To address this, we compress multiple frames into the size of a single standard image, which keeps inference real-time. Experiments on five datasets demonstrate that our framework achieves state-of-the-art performance, efficiently transferring large pre-trained models to violent behavior recognition.
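The abstract does not say how the motion path compresses several frames into one standard-size image. One possible reading, shown here only as an illustrative sketch and not as the authors' method, is to downsample each frame and tile the results into a grid. The function name `pack_frames`, the parameters `grid` and `out_size`, the 224-pixel output size, and the nearest-neighbour downsampling are all assumptions made for this example.

```python
import numpy as np


def pack_frames(frames: np.ndarray, grid: int = 2, out_size: int = 224) -> np.ndarray:
    """Tile grid*grid frames into one (out_size, out_size, C) image.

    Hypothetical helper: `frames` has shape (T, H, W, C) with T == grid * grid.
    """
    t, h, w, c = frames.shape
    if t != grid * grid:
        raise ValueError(f"expected {grid * grid} frames, got {t}")
    if out_size % grid != 0:
        raise ValueError("out_size must be divisible by grid")

    cell = out_size // grid  # side length of each tile in the packed image

    # Nearest-neighbour downsampling (an assumption): keep evenly spaced
    # rows and columns of every frame.
    rows = np.arange(cell) * h // cell
    cols = np.arange(cell) * w // cell
    small = frames[:, rows][:, :, cols]  # shape (T, cell, cell, C)

    # Lay the tiles out row-major: frame 0 top-left, frame T-1 bottom-right.
    tiled = small.reshape(grid, grid, cell, cell, c)
    tiled = tiled.transpose(0, 2, 1, 3, 4)  # (grid_row, y, grid_col, x, C)
    return tiled.reshape(out_size, out_size, c)


# Usage: four 448x448 RGB frames become a single 224x224 RGB image.
frames = np.stack([np.full((448, 448, 3), i, dtype=np.uint8) for i in range(4)])
packed = pack_frames(frames)
```

Under this reading, a ViT-style backbone would see one image per clip instead of one image per frame, which is consistent with the abstract's claim of reduced inference cost. The trade-off is lower spatial resolution per frame.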
Pages: 10