Weapon Violence Dataset 2.0: A synthetic dataset for violence detection

Cited by: 1
Authors
Nadeem, Muhammad Shahroz [1]
Kurugollu, Fatih [2]
Atlam, Hany F. [3]
Franqueira, Virginia N. L. [4]
Affiliations
[1] Univ Suffolk, Sch Technol Business & Arts, Ipswich IP4 1QJ, England
[2] Univ Sharjah, Dept Comp Sci, Coll Comp & Informat, Sharjah 27272, U Arab Emirates
[3] Univ Warwick, Cyber secur Ctr, Warwick Mfg Grp WMG, Coventry CV4 7AL, England
[4] Univ Kent, Sch Comp, Canterbury CT2 7NZ, England
Keywords
Synthetic virtual violence; WVD; Violence detection; GTA-V; Hot and Cold weapons;
DOI
10.1016/j.dib.2024.110448
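Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract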
In the current era, satisfying the appetite of data-hungry models is becoming an increasingly challenging task. This challenge is magnified in sensitive research areas, where genuine data is hard to obtain. The study of violence is a clear example: it raises ethical considerations, and authentic real-world data is scarce and predominantly accessible only to law enforcement agencies. Existing datasets in this field often rely on content from movies or open-source video platforms such as YouTube, which further underlines the scarcity of authentic data. To address this, we create the first synthetic virtual dataset for violence detection, named the Weapon Violence Dataset (WVD). The dataset is generated by staging virtual violence scenarios inside the photo-realistic video game Grand Theft Auto V (GTA-V). It includes carefully selected video clips of person-to-person fights captured from a frontal view, featuring various weapons, both hot and cold, at different times of the day. Specifically, WVD contains three categories: Hot violence and Cold violence (together representing the violence category) and No violence (the control class). The dataset is designed so that the research community can train deep models on synthetic data and extend the data corpus if the need arises. The dataset is publicly available on Kaggle and comprises normal RGB and optical flow videos. © 2024 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Pages: 10
Related papers
7 items in total
[1]   Two-frame motion estimation based on polynomial expansion [J].
Farnebäck, G .
IMAGE ANALYSIS, PROCEEDINGS, 2003, 2749 :363-370
[2]  
Jedijosh920, 2017, GTA5
[3]   Advances, challenges and opportunities in creating data for trustworthy AI [J].
Liang, Weixin ;
Tadesse, Girmaw Abebe ;
Ho, Daniel ;
Li, Fei-Fei ;
Zaharia, Matei ;
Zhang, Ce ;
Zou, James .
NATURE MACHINE INTELLIGENCE, 2022, 4 (08) :669-677
[4]   Deep labeller: automatic bounding box generation for synthetic violence detection datasets [J].
Nadeem, Muhammad Shahroz ;
Kurugollu, Fatih ;
Saravi, Sara ;
Atlam, Hany F. ;
Franqueira, Virginia N. L. .
MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (04) :10717-10734
[5]  
Nikolenko S., 2021, Synthetic Data for Deep Learning, Volume 174
[6]   Playing for Data: Ground Truth from Computer Games [J].
Richter, Stephan R. ;
Vineet, Vibhav ;
Roth, Stefan ;
Koltun, Vladlen .
COMPUTER VISION - ECCV 2016, PT II, 2016, 9906 :102-118
[7]   Synthetic dataset generation for object-to-model deep learning in industrial applications [J].
Wong, Matthew Z. ;
Kunii, Kiyohito ;
Baylis, Max ;
Ong, Wai Hong ;
Kroupa, Pavel ;
Koller, Swen .
PEERJ COMPUTER SCIENCE, 2019, 2019 (10)