Weakly Supervised Video Anomaly Detection via Transformer-Enabled Temporal Relation Learning

Cited by: 22
Authors
Zhang, Dasheng [1]
Huang, Chao [2]
Liu, Chengliang [2]
Xu, Yong [2, 3]
Affiliations
[1] Chongqing Univ, Sch Artificial Intelligence, Chongqing 401135, Peoples R China
[2] Harbin Inst Technol, Shenzhen Key Lab Visual Object Detect & Recognit, Shenzhen 518055, Peoples R China
[3] Peng Cheng Lab, Shenzhen 518055, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Feature extraction; Transformers; Task analysis; Anomaly detection; Training; Surveillance; Training data; Deep learning; video anomaly detection; vision transformer; weakly-supervised learning;
DOI
10.1109/LSP.2022.3175092
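Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification code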
0808 ; 0809 ;
Abstract
Weakly supervised video anomaly detection is challenging because training videos lack frame-level labels. Most previous works tackle this task with the multiple instance learning paradigm, which divides a video into multiple snippets and trains a snippet classifier to distinguish anomalous from normal snippets using video-level supervision. Although existing approaches have achieved remarkable progress, they are still limited by insufficient feature representations. In this paper, we propose a novel weakly supervised temporal relation learning framework for anomaly detection, which efficiently explores the temporal relations between snippets and enhances the discriminative power of features using only videos with video-level labels. To this end, we design a transformer-enabled feature encoder that converts the input task-agnostic features into discriminative task-specific features by mining the semantic correlation and positional relation between video snippets. As a result, our model can detect anomalies in the current video snippet more accurately based on the learned discriminative features. Experimental results indicate that the proposed method outperforms existing state-of-the-art approaches, demonstrating the effectiveness of our model.
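The pipeline the abstract describes (snippet features, an attention-based encoder that relates snippets to each other, a snippet classifier, and video-level supervision) can be illustrated with a small sketch. This is not the authors' implementation. It assumes single-head self-attention with identity projections, sinusoidal position codes, a random linear classifier, and top-k mean pooling as the multiple-instance aggregation. The abstract specifies none of these choices, and every function name below is hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def positional_encoding(num_snippets, dim):
    """Sinusoidal position codes (an assumption; the abstract only mentions
    a 'position relation' between snippets)."""
    return [
        [math.sin(t / 10000 ** (i / dim)) if i % 2 == 0
         else math.cos(t / 10000 ** ((i - 1) / dim))
         for i in range(dim)]
        for t in range(num_snippets)
    ]


def self_attention(features):
    """Single-head attention with identity projections: each snippet becomes
    a similarity-weighted average of all snippets in the same video."""
    dim = len(features[0])
    out = []
    for q in features:
        weights = softmax([dot(q, k) / math.sqrt(dim) for k in features])
        out.append([sum(w * v[j] for w, v in zip(weights, features))
                    for j in range(dim)])
    return out


def snippet_scores(features, w, b):
    """Linear snippet classifier followed by a sigmoid (anomaly score in (0, 1))."""
    return [1.0 / (1.0 + math.exp(-(dot(f, w) + b))) for f in features]


def video_score(scores, k=3):
    """MIL pooling (assumed): the mean of the top-k snippet scores stands in
    for the video-level prediction that the video-level label supervises."""
    top = sorted(scores, reverse=True)[:k]
    return sum(top) / len(top)


random.seed(0)
num_snippets, dim = 8, 4

# Stand-in for task-agnostic features from a pretrained backbone.
raw = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_snippets)]
pos = positional_encoding(num_snippets, dim)
encoded = self_attention(
    [[r + p for r, p in zip(rv, pv)] for rv, pv in zip(raw, pos)]
)

w = [random.gauss(0, 1) for _ in range(dim)]  # untrained classifier weights
scores = snippet_scores(encoded, w, 0.0)
print("video-level anomaly score:", video_score(scores))
```

In a trained model the projections and classifier weights would be learned from video-level labels. Here they are fixed or random, so the printed score only demonstrates the data flow.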
Pages: 1197-1201
Number of pages: 5