VPE-WSVAD: Visual prompt exemplars for weakly-supervised video anomaly detection

Cited by: 10
Authors
Su, Yong [1 ]
Tan, Yuyu [1 ]
Xing, Meng [2 ,3 ]
An, Simin [1 ]
Affiliations
[1] Tianjin Normal Univ, Tianjin 300387, Peoples R China
[2] Tianjin Univ, Tianjin 300350, Peoples R China
[3] Queen Mary Univ London, London E1 4NS, England
Keywords
Video anomaly detection; Weakly-supervised; Visual prompt exemplars; Prompt retrieval network; Proposal filtering mechanism; Prompt likelihood learning; RECONSTRUCTION;
DOI
10.1016/j.knosys.2024.111978
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Weakly Supervised Video Anomaly Detection (WSVAD) plays a crucial role in visual surveillance by effectively distinguishing anomalies from normality with only video-level annotations; nevertheless, due to inherent limitations, including imbalanced data, intra-bag similarity, and snippet entanglement, existing methodologies are prone to a bias towards normality. This paper proposes a novel framework, Visual Prompt Exemplars-based WSVAD (VPE-WSVAD), to enhance discriminative representations of anomalies against normality by incorporating scenario-aware visual prompt exemplars. The proposed VPE-WSVAD framework comprises three key components. First, a prompt retrieval network generates potential abnormal proposals consistent with visual prompt exemplars at the frame level. Second, a proposal filtering mechanism selects the proposal with the highest confidence score to represent potential anomalies in each snippet. Finally, prompt likelihood learning is designed to capture the correlations between proposals (w.r.t. prompt exemplars) and snippets, generating discriminative representations for each snippet. By incorporating visual prompt exemplars, our method provides more detailed reporting of abnormal events, including when, where, and what. To validate the efficacy of our method, we conducted comprehensive evaluations on three publicly available datasets. Experimental results demonstrate the superiority of our approach, achieving a frame-level area under the curve (AUC) of 96.88% on the ShanghaiTech dataset and 99.86% on the UCSD Ped2 dataset. Our code will be released in the future.
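The proposal filtering step described in the abstract (keeping only the highest-confidence proposal per snippet) can be illustrated with a minimal sketch. This is not the authors' implementation; the array shapes, function name, and toy scores below are assumptions for illustration only:

```python
import numpy as np

def filter_proposals(proposal_scores):
    """Hypothetical proposal filtering: for each snippet (row), keep the
    proposal with the highest confidence score (column)."""
    proposal_scores = np.asarray(proposal_scores)
    best_idx = proposal_scores.argmax(axis=1)    # index of top proposal per snippet
    best_score = proposal_scores.max(axis=1)     # its confidence score
    return best_idx, best_score

# Toy example: 3 snippets, each with 4 candidate proposal scores
scores = [[0.1, 0.7, 0.3, 0.2],
          [0.9, 0.2, 0.1, 0.4],
          [0.3, 0.3, 0.8, 0.6]]
idx, conf = filter_proposals(scores)
# idx selects proposals 1, 0, 2 for the three snippets, respectively
```

In the full framework, the retained proposals would then feed into prompt likelihood learning; the sketch only shows the argmax selection itself.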
Pages: 12