Weakly-supervised Video Anomaly Detection with Robust Temporal Feature Magnitude Learning

Cited by: 233
Authors
Tian, Yu [1, 3]
Pang, Guansong [1]
Chen, Yuanhong [1]
Singh, Rajvinder [3]
Verjans, Johan W. [1, 2, 3]
Carneiro, Gustavo [1]
Affiliations
[1] Univ Adelaide, Australian Inst Machine Learning, Adelaide, SA, Australia
[2] Univ Adelaide, Fac Hlth & Med Sci, Adelaide, SA, Australia
[3] South Australian Hlth & Med Res Inst, Adelaide, SA, Australia
Source
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) | 2021
Keywords
EVENT DETECTION; LOCALIZATION
DOI
10.1109/ICCV48922.2021.00493
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Anomaly detection with weakly supervised video-level labels is typically formulated as a multiple instance learning (MIL) problem, in which we aim to identify snippets containing abnormal events, with each video represented as a bag of video snippets. Although current methods show effective detection performance, their recognition of the positive instances, i.e., rare abnormal snippets in the abnormal videos, is largely biased by the dominant negative instances, especially when the abnormal events are subtle anomalies that exhibit only small differences compared with normal events. This issue is exacerbated in many methods that ignore important video temporal dependencies. To address this issue, we introduce a novel and theoretically sound method, named Robust Temporal Feature Magnitude learning (RTFM), which trains a feature magnitude learning function to effectively recognise the positive instances, substantially improving the robustness of the MIL approach to the negative instances from abnormal videos. RTFM also adapts dilated convolutions and self-attention mechanisms to capture long- and short-range temporal dependencies to learn the feature magnitude more faithfully. Extensive experiments show that the RTFM-enabled MIL model (i) outperforms several state-of-the-art methods by a large margin on four benchmark data sets (ShanghaiTech, UCF-Crime, XD-Violence and UCSD-Peds) and (ii) achieves significantly improved subtle anomaly discriminability and sample efficiency.
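The abstract does not spell out the feature magnitude learning function, so the following is only a minimal sketch of the general idea, under stated assumptions: each video is a bag of snippet feature vectors, a video is scored by the mean of its k largest snippet feature magnitudes (L2 norms), and training pushes that score to be higher for abnormal videos than for normal ones by a margin. The function names, the top-k rule, and the parameters `k` and `margin` are illustrative assumptions, not details taken from this record.

```python
import numpy as np


def topk_mean_magnitude(features, k=3):
    """Mean of the k largest L2 norms among a video's snippet features.

    features: array of shape (num_snippets, feature_dim).
    The top-k rule is an assumption made for this sketch.
    """
    magnitudes = np.linalg.norm(features, axis=1)  # one magnitude per snippet
    k = min(k, magnitudes.shape[0])                # guard against short videos
    topk = np.sort(magnitudes)[-k:]                # the k largest magnitudes
    return float(topk.mean())


def magnitude_margin_loss(abnormal_feats, normal_feats, k=3, margin=1.0):
    """Hinge-style loss (assumed form): zero once the abnormal video's
    top-k mean magnitude exceeds the normal video's by at least `margin`."""
    gap = topk_mean_magnitude(abnormal_feats, k) - topk_mean_magnitude(normal_feats, k)
    return max(0.0, margin - gap)
```

In this sketch only the largest-magnitude snippets of each video enter the loss, which is one way to keep the many normal snippets inside an abnormal video from dominating the training signal.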
Pages: 4955-4966
Page count: 12