Inter-patch spatio-temporal relation prediction for video anomaly detection

Authors
Hao Shen [1]
Lu Shi [2]
Linna Zhang [3]
Wanru Xu [1]
Yigang Cen [2]
Gaoyun An [3]
Affiliations
[1] Beijing Jiaotong University, State Key Laboratory of Advanced Rail Autonomous Operation
[2] Beijing Jiaotong University, The School of Computer Science and Technology
[3] Beijing Jiaotong University, Visual Intelligence +X International Cooperation Joint Laboratory of MOE
[4] Guizhou University, School of Mechanical Engineering
Keywords
Video anomaly detection; Self-supervised learning; Pretext task
DOI
10.1007/s11760-025-04156-x
Abstract
Video anomaly detection (VAD), which aims to identify abnormalities within a specific context and timeframe, is crucial for intelligent video surveillance systems. Recent deep learning-based VAD models have shown promising results by generating high-resolution frames, but they often fail to preserve detailed spatial and temporal coherence across video frames. To tackle this issue, we propose a self-supervised learning approach for VAD built on an inter-patch relationship prediction task. Specifically, we introduce a two-branch vision transformer network that captures deep visual features of video frames along the spatial and temporal dimensions, which model appearance and motion patterns, respectively. The inter-patch relationship in each dimension is decoupled into the inter-patch similarity and the order information of each patch. To reduce memory consumption, we convert the order prediction task into a multi-label learning problem and the inter-patch similarity prediction task into an inter-patch distance matrix regression problem. Comprehensive experiments on three public benchmarks demonstrate the effectiveness of our method, which surpasses pixel-generation-based methods by a significant margin and also outperforms other self-supervised learning-based methods.
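The two pretext targets named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not give the exact formulation, so the Euclidean distance, the one-hot position labels, the loss functions, and every name below (`pairwise_distances`, `order_labels`, `bce_with_logits`, `mse`) are assumptions chosen for illustration. One plausible reading of the memory argument is that classifying over all N! patch permutations is replaced by N x N binary labels, one row per shuffled patch.

```python
import numpy as np

def pairwise_distances(patches):
    """Euclidean distance between every pair of patch embeddings: (N, D) -> (N, N)."""
    diff = patches[:, None, :] - patches[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def order_labels(permutation):
    """Multi-label order target (assumed form): row i marks the original
    position of the patch now sitting in shuffled slot i."""
    n = len(permutation)
    labels = np.zeros((n, n))
    labels[np.arange(n), permutation] = 1.0
    return labels

def bce_with_logits(logits, labels):
    """Numerically stable binary cross-entropy averaged over all N x N labels."""
    return np.mean(
        np.maximum(logits, 0) - logits * labels + np.log1p(np.exp(-np.abs(logits)))
    )

def mse(pred, target):
    """Regression loss for the predicted inter-patch distance matrix."""
    return np.mean((pred - target) ** 2)

rng = np.random.default_rng(0)
patches = rng.normal(size=(4, 8))      # 4 hypothetical patch embeddings, 8-D each
permutation = rng.permutation(4)       # shuffled order fed to the network
shuffled = patches[permutation]

distance_target = pairwise_distances(patches)   # regression target, shape (4, 4)
order_target = order_labels(permutation)        # multi-label target, shape (4, 4)
```

In a full model, a network would take `shuffled` as input and output one matrix per task; at test time, large prediction errors on either target would be treated as evidence of an anomaly. That scoring step is likewise an assumption, since the abstract does not describe it.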