Transformer and Adaptive Threshold Sliding Window for Improving Violence Detection in Videos

Authors
Rendon-Segador, Fernando J. [1]
Alvarez-Garcia, Juan A. [1]
Soria-Morillo, Luis M. [1]
Affiliations
[1] Univ Seville, Dept Lenguajes & Sistemas Informat, Seville 41012, Spain
Keywords
deep learning; sliding window; transformer; violence detection; adaptive threshold
DOI
10.3390/s24165429
Chinese Library Classification number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
This paper presents an approach to detecting violent events in videos that combines CrimeNet, a Vision Transformer (ViT) model with structured neural learning and adversarial regularization, with an adaptive threshold sliding window model based on the Transformer architecture. CrimeNet performs strongly on all evaluated datasets (XD-Violence, UCF-Crime, NTU-CCTV Fights, UBI-Fights, Real Life Violence Situations, MediEval, RWF-2000, Hockey Fights, Violent Flows, Surveillance Camera Fights, and Movies Fight), achieving AUC ROC and AUC PR values of up to 99% and 100%, respectively. However, CrimeNet generalizes less well in cross-dataset experiments, where performance drops by 20-30%; for instance, training on UCF-Crime and testing on XD-Violence yielded an AUC ROC of 70.20%. The sliding window model with adaptive thresholding addresses this problem by automatically adjusting the violence detection threshold. Applied as post-processing to the CrimeNet results, it improved detection accuracy by 10% to 15% in cross-dataset experiments. Future lines of research include improving generalization, addressing data imbalance, exploring multimodal representations, testing in real-world applications, and extending the approach to complex human interactions.
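The abstract does not spell out how the threshold is adapted, so the following is only an illustrative sketch of adaptive thresholding over a sliding window, not the Transformer-based model the paper describes. It assumes a sequence of per-clip violence scores in [0, 1] (for example, classifier outputs); the function name and the parameters `window`, `k`, `fallback`, and `min_threshold` are hypothetical.

```python
from statistics import mean, pstdev


def adaptive_threshold_flags(scores, window=4, k=2.0, fallback=0.5, min_threshold=0.2):
    """Flag each score that exceeds a threshold derived from recent scores.

    Illustrative assumption, not the paper's method: the threshold at
    position i is mean + k * std of the previous `window` scores, never
    lower than `min_threshold`. Until two past scores are available, the
    fixed `fallback` threshold is used instead.
    """
    flags = []
    for i, score in enumerate(scores):
        history = scores[max(0, i - window):i]  # trailing window, excludes score i
        if len(history) < 2:
            threshold = fallback
        else:
            threshold = max(min_threshold, mean(history) + k * pstdev(history))
        flags.append(score > threshold)
    return flags


# A single spike against a low-scoring background is flagged.
clip_scores = [0.1, 0.1, 0.1, 0.1, 0.9, 0.1]
print(adaptive_threshold_flags(clip_scores))
```

Because the threshold follows the local score level, a model whose scores are systematically shifted on an unseen dataset can still separate peaks from the background. One limitation of this simple scheme is that a sustained high-scoring event raises its own threshold, so a practical system would need additional logic for long events.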
Pages: 18