ESTS-GCN: An Ensemble Spatial-Temporal Skeleton-Based Graph Convolutional Networks for Violence Detection

Citations: 0

Authors:

Janbi, Nourah Fahad [1]

Ghaseb, Musrea Abdo [2]

Almazroi, Abdulwahab Ali [1]

Affiliations:

[1] Univ Jeddah, Coll Comp & Informat Technol Khulais, Dept Informat Technol, Jeddah, Saudi Arabia

[2] King Abdulaziz Univ, Fac Comp & Informat Technol, Jeddah, Saudi Arabia

Source:

INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS | 2024 / Vol. 2024

Keywords:

graph attention networks; graph convolutional networks; safe community; self-attention; skeleton; smart city; smart surveillance; violence detection

DOI:

10.1155/2024/2323337

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Surveillance systems are essential for social and personal security. However, monitoring multiple video feeds with multiple targets is challenging for human operators. Therefore, automatic and smart surveillance systems have been introduced to support or replace traditional surveillance systems and build safer communities. Advancements in artificial intelligence techniques, particularly in the field of computer vision, have boosted this area of research. Most existing works have focused on image-based (RGB-based) machine learning and deep learning algorithms for detecting anomalous and violent events. In this study, we propose a unique Ensemble Spatial-Temporal Skeleton-Based Graph Convolutional Networks (ESTS-GCNs) model for violence detection that automatically uses spatial and temporal data to detect violence in surveillance videos. Skeleton-based algorithms are less sensitive to pixel-based noise and background interference, making them excellent candidates for activity and anomaly detection. Our proposed ensemble-based architecture utilizes Graph Convolutional Networks (GCNs) and comprises multiple spatial and temporal modules. Three different spatial pipelines are exploited: channel-wise topologies, self-attention mechanism, and graph attention networks. The models were trained and evaluated using two skeleton-based datasets introduced by us: Skeleton-based Real-Life Violence Situations (RLVS) and NTU-Violence (NTU-V). Our model achieved a maximum accuracy of around 93% and outperformed existing models by more than 10%.

Pages: 19
