Elevating urban surveillance: A deep CCTV monitoring system for detection of anomalous events via human action recognition

Cited by: 3
Authors
Kim, Hyungmin [1 ,2 ]
Jeon, Hobeom [1 ,2 ]
Kim, Dohyung [1 ,2 ]
Kim, Jaehong [2 ]
Affiliations
[1] Univ Sci & Technol UST, 217 Gajeong Ro, Daejeon 34113, South Korea
[2] Elect & Telecommun Res Inst ETRI, 218 Gajeong Ro, Daejeon 34129, South Korea
Keywords
Social sustainability; Surveillance system; Abnormal action detection; Deep learning; HUMAN FALL DETECTION; OPTICAL-FLOW; VIOLENCE; CRIME; FEAR;
DOI
10.1016/j.scs.2024.105793
Chinese Library Classification (CLC) number
TU [Building Science]
Discipline classification code
0813
Abstract
With urbanization and the widespread use of CCTV cameras, the processing of surveillance videos has gained importance. This study aims to create a city-wide monitoring system based on human action recognition that can improve the social sustainability of citizens. The primary goal is to develop a complete framework for detecting unusual events in urban environments, with a specific focus on four aberrant actions: "falling," "violence," "loitering," and "intrusion." The processing of CCTV images is vulnerable to adverse weather conditions, and human detection and tracking are particularly affected by obstructions such as occlusion of body parts, for example during falling events. To address these challenges, the paper proposes tracking compensation techniques that improve the system's ability to detect anomalies without requiring additional training. The proposed approach yields a 21.21% improvement in detecting falling events without compromising its handling of the other event types. Overall, the system achieves an average F1 score of 93% across the event categories. The system's effectiveness is assessed through an extensive case study in the subway domain, which sheds light on its robustness and adaptability for potential real-world deployment. The study also examines transfer learning dynamics as a function of sample quantity and of pre-training with relevant human-of-interest data.
Pages: 17