Anomaly detection in surveillance videos using Transformer with margin learning

Cited by: 0

Authors:

Wang, Dicong ^{[1,2]}

Wu, Kaijun ^{[2]}

Affiliations:

[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300354, Peoples R China

[2] Lanzhou Jiaotong Univ, Sch Elect & Informat Engn, Lanzhou 730070, Peoples R China

Source:

MULTIMEDIA SYSTEMS | 2024 / Vol. 30 / Issue 05

Funding:

National Natural Science Foundation of China;

Keywords:

Video anomaly detection; Transformer; Multi-instance learning; Continuity; NETWORK;

DOI:

10.1007/s00530-024-01443-4

Chinese Library Classification number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Weakly supervised video anomaly detection (WSVAD) is an active and challenging research problem in image and video processing. Prior work has typically formulated WSVAD as a multiple-instance learning (MIL) problem. However, many of these methods concentrate primarily on the time periods in which anomalies are clearly discernible. To recognize anomalous events, they rely solely on detecting significant changes in appearance or motion, ignoring the temporal completeness and continuity inherent to anomalous events. They also disregard the subtle correlations at the transitional boundaries between normal and abnormal states. We therefore propose a weakly supervised approach to video anomaly detection based on a Transformer with margin learning. First, our network captures temporal changes around the occurrence of anomalies by using Transformer blocks, which are adept at capturing long-range dependencies in anomalous events. Second, to tackle challenging cases, i.e., normal events that closely resemble anomalous events, we employ a hard score memory. This memory stores the anomaly scores of hard samples, enabling iterative optimization on those hard instances. Third, to strengthen the discriminative capability of the model at the score level, we use pseudo-labels for anomalous events as supplementary supervision for detection. Experiments on two large-scale datasets, ShanghaiTech and UCF-Crime, show that the proposed method is sensitive to anomalous events and performs competitively against state-of-the-art methods.
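The MIL formulation mentioned in the abstract can be illustrated with a generic sketch. It is not the authors' implementation: the function names `bag_score` and `mil_margin_loss`, the top-k pooling, the hinge form of the loss, and the example scores are all assumptions chosen for illustration. Each video is treated as a bag of snippet-level anomaly scores with only a video-level label, and the loss asks the anomalous bag to outscore the normal bag by a margin.

```python
from typing import Sequence


def bag_score(snippet_scores: Sequence[float], k: int = 1) -> float:
    """Mean of the k highest snippet scores in one video (the MIL 'bag')."""
    if not snippet_scores:
        raise ValueError("a bag needs at least one snippet score")
    top = sorted(snippet_scores, reverse=True)[:k]
    return sum(top) / len(top)


def mil_margin_loss(
    anomalous_bag: Sequence[float],
    normal_bag: Sequence[float],
    margin: float = 1.0,
    k: int = 1,
) -> float:
    """Hinge ranking loss between one anomalous and one normal video.

    The loss is zero once the anomalous bag score exceeds the normal
    bag score by at least `margin`.
    """
    gap = bag_score(anomalous_bag, k) - bag_score(normal_bag, k)
    return max(0.0, margin - gap)


# Hypothetical per-snippet anomaly scores in [0, 1] for two videos.
anomalous_video = [0.1, 0.2, 0.9, 0.8]  # video-level label: anomalous
normal_video = [0.1, 0.3, 0.2, 0.1]     # video-level label: normal

loss = mil_margin_loss(anomalous_video, normal_video, margin=1.0, k=2)
print(f"margin loss: {loss:.3f}")
```

Only video-level labels enter the loss, which is what makes the setting weakly supervised. Snippet-level refinements such as the hard score memory or pseudo-labels described in the abstract are not modeled here.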

References

Pages: 13

63 entries in total

