Weakly-supervised anomaly detection with a Sub-Max strategy

Cited by: 2
Authors
Zhang, Bohua [1]
Xue, Jianru [1]
Affiliations
[1] Xi'an Jiaotong University, Institute of Artificial Intelligence and Robotics, Xi'an, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Video analysis; Anomaly detection; Weakly-supervised setting; Selective sampling; Multiple instance learning; Deep neural network; Abnormal event detection; Behavior detection; Localization
DOI
10.1016/j.neucom.2023.126770
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We study weakly-supervised anomaly detection, where only video-level "anomalous"/"normal" labels are available in training, while anomaly events should be temporally localized in testing. For this task, a commonly used framework is multiple instance learning (MIL), where clip instances are sampled from individual videos to form video-level bags. This sampling process is arguably a bottleneck of MIL. If too many instances are sampled, we not only incur high computational overhead but also have many noisy instances in the bag. On the other hand, when too few instances are used, e.g., through enlarged grids, much background noise may be included in the anomaly instances. To resolve this dilemma, we propose a simple yet effective method named Sub-Max. In partitioned image regions, it identifies the instances that are the most probable candidates for anomaly events by selecting cuboids that have high optical flow magnitudes. We show that our method effectively brings down the computational cost of the baseline MIL and at the same time significantly filters out the influence of noise. Although simple, this strategy is shown to facilitate the learning of discriminative features and thus improve event classification and localization performance. For example, after annotating the event location ground truths of the UCF-Crime test set, we report very competitive accuracy compared with the state of the art on both frame-level and pixel-level metrics, corresponding to classification and localization, respectively.
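The selection step described in the abstract (partition the image into regions, then keep the cuboid with high optical flow magnitude in each region) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sub_max_select`, the grid and cuboid sizes, the non-overlapping tiling of candidates, and the use of mean magnitude as the score are all assumptions made for illustration, and the flow magnitudes are assumed to be precomputed.

```python
import numpy as np


def sub_max_select(flow_mag, grid=(2, 2), cuboid=(16, 16)):
    """Pick one candidate cuboid per grid region by optical flow magnitude.

    Hypothetical helper, not taken from the paper.
    flow_mag: array of shape (T, H, W) with per-pixel flow magnitudes.
    grid:     (rows, cols) used to partition each frame into regions.
    cuboid:   (height, width) of a candidate; it spans all T frames.
    Returns a list of (top, left, score) tuples, one per region, in
    row-major region order; coordinates are in full-frame pixels.
    """
    T, H, W = flow_mag.shape
    rows, cols = grid
    ch, cw = cuboid
    rh, rw = H // rows, W // cols  # region height and width
    if ch > rh or cw > rw:
        raise ValueError("cuboid must fit inside a grid region")

    # Collapse time once so each candidate is scored with a 2-D slice.
    energy = flow_mag.sum(axis=0)

    selected = []
    for r in range(rows):
        for c in range(cols):
            region = energy[r * rh:(r + 1) * rh, c * rw:(c + 1) * rw]
            best = None
            # Assumption: candidates tile the region without overlap.
            for top in range(0, rh - ch + 1, ch):
                for left in range(0, rw - cw + 1, cw):
                    tile = region[top:top + ch, left:left + cw]
                    score = float(tile.mean() / T)  # mean magnitude per pixel per frame
                    if best is None or score > best[2]:
                        best = (r * rh + top, c * rw + left, score)
            selected.append(best)
    return selected
```

With this sketch, each video clip contributes `rows * cols` instances to its bag rather than every grid cell, which is one plausible reading of how the sampling cost is reduced; the actual sampling details are given in the paper itself.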
Pages: 13