Inter-patch spatio-temporal relation prediction for video anomaly detection

Authors
Hao Shen [1]
Lu Shi [2]
Linna Zhang [3]
Wanru Xu [1]
Yigang Cen [2]
Gaoyun An [3]
Affiliations
[1] Beijing Jiaotong University, State Key Laboratory of Advanced Rail Autonomous Operation
[2] Beijing Jiaotong University, The School of Computer Science and Technology
[3] Beijing Jiaotong University, Visual Intelligence +X International Cooperation Joint Laboratory of MOE
[4] Guizhou University, School of Mechanical Engineering
Keywords
Video anomaly detection; Self-supervised learning; Pretext task
DOI
10.1007/s11760-025-04156-x
Abstract
Video anomaly detection (VAD), which aims to identify abnormalities within a specific context and timeframe, is crucial for intelligent video surveillance systems. While recent deep learning-based VAD models have shown promising results by generating high-resolution frames, they often fail to preserve detailed spatial and temporal coherence in video frames. To tackle this issue, we propose a self-supervised learning approach for VAD built on an inter-patch relationship prediction task. Specifically, we introduce a two-branch vision transformer network designed to capture deep visual features of video frames, whose branches address the spatial and temporal dimensions to model appearance and motion patterns, respectively. The inter-patch relationship in each dimension is decoupled into the inter-patch similarity and the order information of each patch. To reduce memory consumption, we convert the order information prediction task into a multi-label learning problem, and the inter-patch similarity prediction task into an inter-patch distance matrix regression problem. Comprehensive experiments on three public benchmarks demonstrate the effectiveness of our method, which surpasses pixel-generation-based methods by a significant margin. Our approach also outperforms other self-supervised learning-based methods.
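The two pretext targets named in the abstract can be illustrated with a small NumPy sketch. This is only one plausible reading and not the authors' implementation: the abstract does not specify the distance metric, the label encoding, or the patch features, so the Euclidean distance, the per-patch 0/1 label rows, and the helper names `pairwise_distance_matrix` and `order_targets` are all assumptions made for illustration.

```python
import numpy as np


def pairwise_distance_matrix(patches):
    """Return the (n, n) matrix of Euclidean distances between patch features.

    Assumption: Euclidean distance stands in for the unspecified
    inter-patch similarity measure.
    """
    diff = patches[:, None, :] - patches[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def order_targets(permutation):
    """Return an (n, n) 0/1 label matrix for a shuffled patch sequence.

    Row i marks the original index of the patch now at position i.
    Assumption: this n x n encoding stands in for the multi-label
    formulation; it avoids one class per permutation (n! classes).
    """
    n = len(permutation)
    targets = np.zeros((n, n), dtype=np.float32)
    targets[np.arange(n), permutation] = 1.0
    return targets


# Hypothetical input: 9 patches with 16-dimensional features.
rng = np.random.default_rng(0)
patches = rng.normal(size=(9, 16))

# Shuffle the patches, then build both regression/classification targets.
perm = rng.permutation(9)
shuffled = patches[perm]
dist_target = pairwise_distance_matrix(shuffled)  # regression target
order_target = order_targets(perm)                # multi-label target
```

Under these assumptions both targets grow quadratically with the number of patches, whereas treating every permutation as its own class would grow factorially, which is consistent with the memory motivation stated in the abstract.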