Anomaly Detection Based on a 3D Convolutional Neural Network Combining Convolutional Block Attention Module Using Merged Frames

Cited by: 10
Authors
Hwang, In-Chang [1]
Kang, Hyun-Soo [1]
Affiliations
[1] Chungbuk Natl Univ, Sch Elect & Comp Engn, Dept Informat & Commun Engn, Cheongju 28644, South Korea
Keywords
video surveillance; convolutional block attention module; area under the curve; equal error rate; computer vision; UBI-Fights; 3D convolution; violence detection; event detection; video
DOI
10.3390/s23239616
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Discipline classification code
070302; 081704
Abstract
With the recent rise in violent crime, the real-time situation analysis capabilities of widely deployed closed-circuit television have been employed to deter and resolve criminal activities. Anomaly detection can identify abnormal instances, such as violence, within the patterns of a specified dataset; however, it is challenging because far less data is available for abnormal situations than for normal ones. Herein, using datasets such as UBI-Fights, RWF-2000, and UCSD Ped1 and Ped2, anomaly detection was approached as a binary classification problem. Annotated frames extracted from each video were reconstructed into a limited number of merged images with 3×3, 4×3, 4×4, and 5×3 layouts using the method proposed in this paper, forming an input data structure similar to a light field and to the patches of a vision transformer. The model was constructed by applying a convolutional block attention module, which includes channel and spatial attention modules, to residual neural networks with depths of 10, 18, 34, and 50 implemented with three-dimensional convolutions. The proposed model performed better than existing models in detecting abnormal behavior such as violent acts in videos. For instance, on the undersampled UBI-Fights dataset, the network achieved an accuracy of 0.9933, a loss value of 0.0010, an area under the curve of 0.9973, and an equal error rate of 0.0027. These results may contribute significantly to solving real-world problems such as detecting violent behavior in artificial intelligence systems that use computer vision and real-time video monitoring.
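The frame-merging step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `merge_frames`, the row-major ordering of frames within the grid, and the reading of "3×3" as rows × columns are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def merge_frames(frames: np.ndarray, rows: int, cols: int) -> np.ndarray:
    """Tile rows*cols frames into a single merged image.

    frames: array of shape (N, H, W, C) with N == rows * cols.
    Returns an array of shape (rows * H, cols * W, C).
    Assumption: frames are placed in row-major order (left to right,
    then top to bottom); the paper may order them differently.
    """
    n, h, w, c = frames.shape
    if n != rows * cols:
        raise ValueError(f"expected {rows * cols} frames, got {n}")
    # Split the frame axis into a (rows, cols) grid of frames.
    grid = frames.reshape(rows, cols, h, w, c)
    # Reorder to (rows, H, cols, W, C) so that pixel rows of frames in the
    # same grid row become contiguous, then collapse into one image.
    return grid.transpose(0, 2, 1, 3, 4).reshape(rows * h, cols * w, c)


# Hypothetical example: nine tiny 2x2 RGB "frames" merged into a 3x3 grid.
frames = np.arange(9 * 2 * 2 * 3).reshape(9, 2, 2, 3)
merged = merge_frames(frames, rows=3, cols=3)
```

Under these assumptions, `merged` has shape `(6, 6, 3)`, and the second frame occupies the tile at `merged[0:2, 2:4]`. A sequence of such merged images could then be stacked along a depth axis to form the input of a 3D convolutional network, although the exact stacking used in the paper is not stated in the abstract.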
Pages: 24