Inter-patch spatio-temporal relation prediction for video anomaly detection

Cited by: 0

Authors:

Hao Shen [1]

Lu Shi [2]

Linna Zhang [3]

Wanru Xu [1]

Yigang Cen [2]

Gaoyun An [3]

Affiliations:

[1] Beijing Jiaotong University, State Key Laboratory of Advanced Rail Autonomous Operation

[2] Beijing Jiaotong University, The School of Computer Science and Technology

[3] Beijing Jiaotong University, Visual Intelligence +X International Cooperation Joint Laboratory of MOE

[4] Guizhou University, School of Mechanical Engineering

Source:

Signal, Image and Video Processing | 2025, Vol. 19, No. 7

Keywords:

Video anomaly detection; Self-supervised learning; Pretext task

DOI:

10.1007/s11760-025-04156-x

Abstract:

Video anomaly detection (VAD), aiming to identify abnormalities within a specific context and timeframe, is crucial for intelligent video surveillance systems. While recent deep learning-based VAD models have shown promising results by generating high-resolution frames, they often lack competence in preserving detailed spatial and temporal coherence in video frames. To tackle this issue, we propose a self-supervised learning approach for VAD through an inter-patch relationship prediction task. Specifically, we introduce a two-branch vision transformer network designed to capture deep visual features of video frames, which can address spatial and temporal dimensions responsible for modeling appearance and motion patterns, respectively. The inter-patch relationship in each dimension is decoupled into inter-patch similarity and the order information of each patch. To mitigate memory consumption, we convert the order information prediction task into a multi-label learning problem, and the inter-patch similarity prediction task into an inter-patch distance matrix regression problem. Comprehensive experiments demonstrate the effectiveness of our method, surpassing pixel-generation-based methods by a significant margin across three public benchmarks. Additionally, our approach outperforms other self-supervised learning-based methods.
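
The two prediction targets named in the abstract can be illustrated with a small sketch. This is only one plausible reading of the abstract, not the authors' implementation: the function names, the toy feature vectors, and the choice of Euclidean distance are all assumptions made for illustration. The order target is encoded as an N x N 0/1 matrix (one label per shuffled position) rather than as a single class per possible permutation, which is one common way to pose order prediction as multi-label learning.

```python
import math
import random


def pairwise_distance_matrix(features):
    """Return the N x N matrix of Euclidean distances between patch features.

    Assumption: Euclidean distance stands in for whatever inter-patch
    distance the paper actually regresses.
    """
    return [[math.dist(a, b) for b in features] for a in features]


def order_targets(permutation):
    """Encode a permutation as an N x N 0/1 matrix.

    Row i has a single 1 in the column of the original index of the patch
    that now sits at position i, so each position gets its own label.
    """
    n = len(permutation)
    return [[1 if j == src else 0 for j in range(n)] for src in permutation]


# Hypothetical toy input: 4 patches, each with a 3-dimensional feature vector.
features = [
    [0.0, 0.0, 0.0],
    [3.0, 4.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
]

# Shuffle the patch order with a fixed seed so the sketch is repeatable.
rng = random.Random(0)
permutation = list(range(len(features)))
rng.shuffle(permutation)
shuffled = [features[i] for i in permutation]

# Regression target: distances between the shuffled patches.
distance_target = pairwise_distance_matrix(shuffled)
# Multi-label target: which original index each shuffled position holds.
order_target = order_targets(permutation)
```

Under this encoding both targets grow as N x N with the number of patches, whereas a one-class-per-permutation formulation would need N! output classes.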

