Three-dimensional atrous inception module for crowd behavior classification

Cited by: 1
Authors
Choi, Jong-Hyeok [1, 2]
Kim, Jeong-Hun [1]
Nasridinov, Aziz [1, 3]
Kim, Yoo-Sung [4]
Affiliations
[1] Chungbuk Natl Univ, Bigdata Res Inst, Cheongju 28644, South Korea
[2] AICON Co Ltd, Res Inst, Seoul 06774, South Korea
[3] Chungbuk Natl Univ, Dept Comp Sci, Cheongju 28644, South Korea
[4] Inha Univ, Dept Artificial Intelligence, Incheon 22212, South Korea
Source
SCIENTIFIC REPORTS | 2024 / Vol. 14 / Issue 01
Funding
National Research Foundation of Singapore
DOI
10.1038/s41598-024-65003-6
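Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract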
Recent advances in deep learning have led to a surge in computer vision research, including the recognition and classification of human behavior in video data. However, most studies have focused on recognizing individual behaviors, whereas recognizing crowd behavior remains a complex problem because of the large number of interactions and similar behaviors among individuals or crowds in video surveillance systems. To solve this problem, we propose a three-dimensional atrous inception module (3D-AIM) network, which is a crowd behavior classification model that uses atrous convolution to explore interactions between individuals or crowds. The 3D-AIM network is a 3D convolutional neural network that can use receptive fields of various sizes to effectively identify specific features that determine crowd behavior. To further improve the accuracy of the 3D-AIM network, we introduce a new loss function called the separation loss function. This loss function focuses the 3D-AIM network more on the features that distinguish one type of crowd behavior from another, thereby enabling a more precise classification. Finally, we demonstrate that the proposed model outperforms existing human behavior classification models in accurately classifying crowd behaviors. These results suggest that the 3D-AIM network with a separation loss function can be valuable for understanding complex crowd behavior in video surveillance systems.
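The abstract's point about "receptive fields of various sizes" can be illustrated with the standard relation for atrous (dilated) convolution: a kernel of size k with dilation rate d spans k + (k - 1)(d - 1) positions along an axis while keeping the same number of weights. The sketch below is only an illustration of that relation. The function names and the dilation rates (1, 2, 3) are hypothetical and are not taken from the paper's 3D-AIM configuration.

```python
from typing import Dict, Tuple


def effective_kernel(kernel: int, dilation: int) -> int:
    """Extent covered along one axis by a dilated (atrous) kernel."""
    return kernel + (kernel - 1) * (dilation - 1)


def branch_extents(kernel: int, dilations: Tuple[int, ...]) -> Dict[int, Tuple[int, int, int]]:
    """Map each dilation rate to the (time, height, width) extent of a cubic 3D kernel."""
    return {d: (effective_kernel(kernel, d),) * 3 for d in dilations}


# Hypothetical dilation rates for three parallel, inception-style branches.
extents = branch_extents(kernel=3, dilations=(1, 2, 3))
for rate, extent in extents.items():
    # Every branch uses 3 * 3 * 3 = 27 weights per filter channel,
    # yet the volume it covers grows with the dilation rate.
    print(f"dilation={rate}: covers {extent} with {3 ** 3} weights")
```

Under these assumptions, parallel branches with different dilation rates see video volumes of different sizes at the same parameter cost, which is the general motivation for combining atrous convolution with an inception-style layout.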
Pages: 15