Inter-patch spatio-temporal relation prediction for video anomaly detection

Cited by: 0

Authors:

Hao Shen [1]

Lu Shi [2]

Linna Zhang [3]

Wanru Xu [1]

Yigang Cen [2]

Gaoyun An [3]

Affiliations:

[1] Beijing Jiaotong University, State Key Laboratory of Advanced Rail Autonomous Operation

[2] Beijing Jiaotong University, The School of Computer Science and Technology

[3] Beijing Jiaotong University, Visual Intelligence +X International Cooperation Joint Laboratory of MOE

[4] Guizhou University, School of Mechanical Engineering

Source:

Signal, Image and Video Processing | 2025 / Vol. 19 / Issue 7

Keywords:

Video anomaly detection; Self-supervised learning; Pretext task

DOI:

10.1007/s11760-025-04156-x

Abstract:

Video anomaly detection (VAD), which aims to identify abnormalities within a specific context and timeframe, is crucial for intelligent video surveillance systems. Recent deep learning-based VAD models have shown promising results by generating high-resolution frames, but they often fail to preserve detailed spatial and temporal coherence across video frames. To tackle this issue, we propose a self-supervised learning approach for VAD built on an inter-patch relationship prediction task. Specifically, we introduce a two-branch vision transformer network that captures deep visual features of video frames; its branches address the spatial and temporal dimensions, modeling appearance and motion patterns, respectively. The inter-patch relationship in each dimension is decoupled into inter-patch similarity and the order information of each patch. To reduce memory consumption, we convert the order information prediction task into a multi-label learning problem, and the inter-patch similarity prediction task into an inter-patch distance matrix regression problem. Comprehensive experiments demonstrate the effectiveness of our method, which surpasses pixel-generation-based methods by a significant margin on three public benchmarks. Our approach also outperforms other self-supervised learning-based methods.
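
The abstract does not say how the two prediction targets are constructed, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes that each patch is represented by a feature vector, that the similarity target is the pairwise Euclidean distance matrix between patches, and that the order target is an N x N binary matrix marking the original position of each shuffled patch (one possible multi-label formulation that avoids one class per permutation). All function names are hypothetical.

```python
import math
import random


def pairwise_distance_matrix(features):
    """Return the N x N Euclidean distance matrix between patch feature vectors.

    Assumption: Euclidean distance stands in for whatever similarity
    measure the paper actually regresses.
    """
    n = len(features)
    return [[math.dist(features[i], features[j]) for j in range(n)]
            for i in range(n)]


def shuffle_patches(features, rng):
    """Shuffle patches; permutation[i] is the original index of the patch now at slot i."""
    permutation = list(range(len(features)))
    rng.shuffle(permutation)
    shuffled = [features[k] for k in permutation]
    return shuffled, permutation


def order_targets(permutation):
    """Return an N x N binary matrix; row i marks the original index of slot i.

    Assumption: N position labels per patch instead of N! permutation classes.
    """
    n = len(permutation)
    return [[1 if permutation[i] == j else 0 for j in range(n)]
            for i in range(n)]


# Toy example: three patches with 2-D features (hypothetical values).
patches = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
rng = random.Random(0)

shuffled, perm = shuffle_patches(patches, rng)
dist_target = pairwise_distance_matrix(patches)  # regression target
order_target = order_targets(perm)               # multi-label target
```

Under these assumptions the order target grows as N^2 rather than N!, which is one way a multi-label formulation could keep memory use manageable as the number of patches increases.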

