Dynamic Erasing Network With Adaptive Temporal Modeling for Weakly Supervised Video Anomaly Detection

Citations: 0
Authors
Zhang, Chen [1, 2]
Li, Guorong [3]
Qi, Yuankai [4]
Ye, Hanhua [3]
Qing, Laiyun [3]
Yang, Ming-Hsuan [5, 6, 7]
Huang, Qingming [3]
Affiliations
[1] Chinese Acad Sci, Inst Informat Engn, Beijing 100085, Peoples R China
[2] Univ Chinese Acad Sci, Sch Cyber Secur, Beijing 100049, Peoples R China
[3] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Key Lab Big Data Min & Knowledge Management, Beijing 100049, Peoples R China
[4] Macquarie Univ, Sch Comp, Sydney, NSW 2109, Australia
[5] Univ Calif Merced, Dept Elect Engn & Comp Sci, Merced, CA 95343 USA
[6] Yonsei Univ, Coll Comp, Seoul 03722, South Korea
[7] Google, Mountain View, CA 94043 USA
Funding
National Natural Science Foundation of China
Keywords
Anomaly detection; Adaptation models; Feature extraction; Training; Weak supervision; Predictive models; Context modeling; Annotations; Adaptive systems; Visualization; Adaptive temporal modeling (ATM); dynamic erasing (DE); video anomaly detection; weak supervision; PREDICTION;
DOI
10.1109/TNNLS.2025.3553556
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Weakly supervised video anomaly detection aims to learn a detection model using only video-level labels. Prior studies ignore the complexity or duration of anomalies in abnormal videos during temporal modeling. Moreover, existing works usually detect only the most abnormal segments, potentially missing the full extent of an anomaly. To address these limitations, we propose a dynamic erasing network (DE-Net) for weakly supervised video anomaly detection, which learns video-specific temporal features via adaptive temporal modeling (ATM). Specifically, to handle duration variations of abnormal events, we propose an ATM module that adaptively selects and aggregates the K most appropriate temporal-scale features for each video. We then design a dynamic erasing (DE) strategy that dynamically assesses the completeness of the detected anomalies and erases prominent abnormal segments to encourage the model to discover less prominent ("gentle") abnormal segments. The proposed method achieves favorable performance compared with several state-of-the-art approaches on the widely used XD-Violence, TAD, and UCF-Crime datasets.
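The two ideas named in the abstract, choosing the K best temporal scales per video and erasing the most prominent segments, can be illustrated with a toy sketch. This is not the DE-Net implementation: this record does not describe the actual modules. The function names, the scalar per-segment features, the weighted-average aggregation, and the fixed erasing threshold are all assumptions made for illustration. In particular, the fixed threshold stands in for the dynamic completeness assessment, which is not modeled here.

```python
# Toy illustration only; all names and design choices are hypothetical,
# not taken from the DE-Net paper.

def select_top_k_scales(scale_weights, k):
    """Return the indices of the k largest scale weights, highest first."""
    order = sorted(range(len(scale_weights)),
                   key=lambda i: scale_weights[i], reverse=True)
    return order[:k]


def aggregate_scales(scale_features, scale_weights, k):
    """Weighted average of per-segment features over the k selected scales.

    scale_features[i][t] is a scalar feature for scale i at segment t
    (assumption: real features would be vectors).
    """
    chosen = select_top_k_scales(scale_weights, k)
    total = sum(scale_weights[i] for i in chosen)
    n_segments = len(scale_features[0])
    return [
        sum(scale_weights[i] * scale_features[i][t] for i in chosen) / total
        for t in range(n_segments)
    ]


def erase_prominent(scores, threshold):
    """Mask out segments whose anomaly score reaches the threshold.

    Returns (mask, erased_scores); mask is 0.0 for erased segments and
    1.0 for kept ones. A fixed threshold is an assumption of this sketch.
    """
    mask = [0.0 if s >= threshold else 1.0 for s in scores]
    erased = [s * m for s, m in zip(scores, mask)]
    return mask, erased
```

In this sketch, a second scoring pass over the erased sequence would only see the remaining segments, which is the intuition behind encouraging a model to find less prominent anomalies.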
Pages: 15