Spatiotemporal Masked Autoencoder with Multi-Memory and Skip Connections for Video Anomaly Detection

Cited by: 5
Authors
Fu, Yan [1 ]
Yang, Bao [1 ]
Ye, Ou [1 ]
Affiliations
[1] Xian Univ Sci & Technol, Sch Comp Sci & Technol, Xian 710054, Peoples R China
Keywords
video anomaly detection; memory network; spatiotemporal masked autoencoder; vision transformer; skip connections;
DOI
10.3390/electronics13020353
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Video anomaly detection is a critical component of intelligent video surveillance systems, extensively deployed and researched in industry and academia. However, existing methods have a strong generalization ability, even when predicting anomalous samples, and they cannot utilize high-level semantic and temporal contextual information in videos, resulting in unstable prediction performance. To alleviate this issue, we propose an encoder-decoder model named SMAMS, based on a spatiotemporal masked autoencoder and memory modules. First, we represent video events as spatiotemporal cubes and mask some of them. Then, the unmasked patches are fed into the spatiotemporal masked autoencoder to extract high-level semantic and spatiotemporal features of the video events. Next, we add multiple memory modules to store the unmasked video patches at different feature layers. Finally, skip connections are introduced to compensate for the loss of crucial information caused by the memory modules. Experimental results show that the proposed method outperforms state-of-the-art methods, achieving AUC scores of 99.9%, 94.8%, and 78.9% on the UCSD Ped2, CUHK Avenue, and ShanghaiTech datasets, respectively.
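The first and third steps of the abstract (masking spatiotemporal cubes, then reading from a memory module) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the cube size, mask ratio, or memory addressing scheme, so the function names `mask_cubes` and `memory_read`, the cube size `(2, 16, 16)`, the mask ratio `0.75`, and the cosine-similarity softmax read are all assumptions chosen for illustration.

```python
import numpy as np

def mask_cubes(video, cube=(2, 16, 16), mask_ratio=0.75, seed=0):
    """Split a (T, H, W, C) clip into non-overlapping cubes and mask a random subset.

    Cube size and mask ratio are hypothetical defaults, not values from the paper.
    Returns the visible (unmasked) cubes, their indices, and a boolean mask
    where True marks a masked cube.
    """
    T, H, W, C = video.shape
    ct, ch, cw = cube
    assert T % ct == 0 and H % ch == 0 and W % cw == 0, "clip must tile evenly"
    nt, nh, nw = T // ct, H // ch, W // cw
    # Group the pixels of each cube together, then flatten each cube to a vector.
    cubes = (video.reshape(nt, ct, nh, ch, nw, cw, C)
                  .transpose(0, 2, 4, 1, 3, 5, 6)
                  .reshape(nt * nh * nw, -1))
    n = cubes.shape[0]
    n_keep = int(round(n * (1.0 - mask_ratio)))
    rng = np.random.default_rng(seed)
    keep_idx = np.sort(rng.permutation(n)[:n_keep])
    mask = np.ones(n, dtype=bool)
    mask[keep_idx] = False
    return cubes[keep_idx], keep_idx, mask

def memory_read(queries, memory):
    """Read from a memory bank with softmax over cosine similarity (assumed scheme).

    queries: (N, D) features; memory: (M, D) stored items.
    Returns the (N, D) read-out and the (N, M) addressing weights.
    """
    q = queries / (np.linalg.norm(queries, axis=1, keepdims=True) + 1e-8)
    m = memory / (np.linalg.norm(memory, axis=1, keepdims=True) + 1e-8)
    sim = q @ m.T
    sim = sim - sim.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(sim)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ memory, weights

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    clip = rng.random((8, 64, 64, 3))           # toy clip: 8 frames of 64x64 RGB
    visible, keep_idx, mask = mask_cubes(clip)
    memory = rng.random((10, visible.shape[1]))  # toy memory with 10 items
    read, weights = memory_read(visible, memory)
    # A skip connection is sketched here as a simple sum of the memory
    # read-out and the original features (an assumption, not the paper's design).
    fused = read + visible
    print(visible.shape, int(mask.sum()), fused.shape)
```

In this sketch the read-out is a convex combination of memory items, so features far from anything stored are pulled toward stored patterns; adding the original features back illustrates why a skip path can restore detail the memory read discards.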
Pages: 20