Human-Scene Network: A novel baseline with self-rectifying loss for weakly supervised video anomaly detection

Cited by: 3
Authors
Majhi, Snehashis [1]
Dai, Rui [1]
Kong, Quan [2]
Garattoni, Lorenzo [3]
Francesca, Gianpiero [3]
Bremond, Francois [1]
Affiliations
[1] INRIA, 2004 Rte Lucioles, Valbonne, France
[2] Woven Planet Holdings, 3-2-1 Nihonbashimuromachi, Chuo Ku, Tokyo, Japan
[3] Toyota Motor Europe, 60 Av Bourget, Brussels, Belgium
Keywords
Video anomaly detection; Weakly-supervised learning; Abnormal event detection
DOI
10.1016/j.cviu.2024.103955
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video anomaly detection in surveillance systems with only video-level labels (i.e., weakly supervised) is challenging. This is due to (i) the complex integration of a large variety of scenarios, including human- and scene-based anomalies characterized by subtle or sharp spatio-temporal cues in real-world videos, and (ii) non-optimal optimization between normal and anomaly instances under weak supervision. In this paper, we propose a Human-Scene Network to learn discriminative representations by capturing both subtle and strong cues in a dissociative manner. In addition, a self-rectifying loss is proposed that dynamically computes the pseudo-temporal annotations from video-level labels for optimizing the Human-Scene Network effectively. The proposed Human-Scene Network optimized with self-rectifying loss is validated on three publicly available datasets, i.e., UCF-Crime, ShanghaiTech, and IITB-Corridor, outperforming recently reported state-of-the-art approaches on five out of the six scenarios considered.
Pages: 11