Spatiotemporal Masked Autoencoder with Multi-Memory and Skip Connections for Video Anomaly Detection

Cited by: 5
Authors
Fu, Yan [1]
Yang, Bao [1]
Ye, Ou [1]
Affiliations
[1] Xi'an University of Science and Technology, School of Computer Science and Technology, Xi'an 710054, People's Republic of China
Keywords
video anomaly detection; memory network; spatiotemporal masked autoencoder; vision transformer; skip connections
DOI
10.3390/electronics13020353
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Video anomaly detection is a critical component of intelligent video surveillance systems, extensively deployed and researched in industry and academia. However, existing methods generalize too well to anomalous samples, and they fail to exploit high-level semantic and temporal contextual information in videos, resulting in unstable prediction performance. To alleviate this issue, we propose an encoder-decoder model named SMAMS, based on a spatiotemporal masked autoencoder and memory modules. First, we represent video events as spatiotemporal cubes and mask some of them. Then, the unmasked patches are fed into the spatiotemporal masked autoencoder to extract high-level semantic and spatiotemporal features of the video events. Next, we add multiple memory modules to store the unmasked video patches at different feature layers. Finally, skip connections are introduced to compensate for the loss of crucial information caused by the memory modules. Experimental results show that the proposed method outperforms state-of-the-art methods, achieving AUC scores of 99.9%, 94.8%, and 78.9% on the UCSD Ped2, CUHK Avenue, and ShanghaiTech datasets, respectively.
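The cube-masking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cube size (2 x 16 x 16), the 75% mask ratio, uniform random sampling, and the function names `cube_grid` and `random_cube_mask` are all assumptions made for illustration, since the abstract does not specify them.

```python
import random


def cube_grid(frames, height, width, cube=(2, 16, 16)):
    """Return the (t, y, x) origin of every non-overlapping spatiotemporal cube.

    The cube size is an assumed value, not taken from the paper.
    """
    ct, ch, cw = cube
    if frames % ct or height % ch or width % cw:
        raise ValueError("video dimensions must be divisible by the cube size")
    return [
        (t, y, x)
        for t in range(0, frames, ct)
        for y in range(0, height, ch)
        for x in range(0, width, cw)
    ]


def random_cube_mask(num_cubes, mask_ratio=0.75, seed=None):
    """Split cube indices into (visible, masked) by uniform random sampling.

    The mask ratio is an assumed value, not taken from the paper.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    order = list(range(num_cubes))
    rng.shuffle(order)
    num_masked = int(num_cubes * mask_ratio)
    masked = sorted(order[:num_masked])
    visible = sorted(order[num_masked:])
    return visible, masked


# An 8-frame, 64x64 clip yields 4 * 4 * 4 = 64 cubes.
origins = cube_grid(frames=8, height=64, width=64)
visible, masked = random_cube_mask(len(origins), mask_ratio=0.75, seed=0)
print(len(origins), len(visible), len(masked))  # prints "64 16 48"
```

In a masked autoencoder of this kind, only the cubes listed in `visible` would be passed to the encoder, and the decoder would be asked to reconstruct the cubes listed in `masked`.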
Pages: 20
Related papers
50 in total
  • [21] Learning dual updatable memory modules for video anomaly detection
    Zhang, Liang
    Li, Shifeng
    Cheng, Yan
    Luo, Xi
    Liu, Xiaoru
    Multimedia Systems, 2025, 31 (01)
  • [22] Video anomaly detection with memory-guided multilevel embedding
    Zhou, Liuping
    Yang, Jing
    International Journal of Multimedia Information Retrieval, 2023, 12 (01)
  • [23] Spatio-Temporal United Memory for Video Anomaly Detection
    Wang, Yunlong
    Chen, Mingyi
    Li, Jiaxin
    Li, Hongjun
    Structural, Syntactic, and Statistical Pattern Recognition, S+SSPR 2022, 2022, 13813: 84-93
  • [25] HN-MUM: heterogeneous video anomaly detection network with multi-united-memory module
    Li, Hongjun
    Wang, Yunlong
    Chen, Mingyi
    Li, Jiaxin
    Multimedia Tools and Applications, 2023, 82 (20): 31521-31538
  • [27] Learning Spatiotemporal Features With 3DCNN and ConvGRU for Video Anomaly Detection
    Wang, Xin
    Xie, Weixin
    Song, Jiayi
    Proceedings of 2018 14th IEEE International Conference on Signal Processing (ICSP), 2018: 474-479
  • [28] Memory-Augmented Spatial-Temporal Consistency Network for Video Anomaly Detection
    Li, Zhangxun
    Zhao, Mengyang
    Zeng, Xinhua
    Wang, Tian
    Pang, Chengxin
    Pattern Recognition and Computer Vision, PRCV 2023, Pt VI, 2024, 14430: 95-107
  • [29] Memory-enhanced appearance-motion consistency framework for video anomaly detection
    Ning, Zhiyuan
    Wang, Zile
    Liu, Yang
    Liu, Jing
    Song, Liang
    Computer Communications, 2024, 216: 159-167
  • [30] A novel spatio-temporal memory network for video anomaly detection
    Li, H.
    Chen, M.
    Multimedia Tools and Applications, 2025, 84 (8): 4603-4624