SVM directed machine learning classifier for human action recognition network

Cited by: 5
Authors
Lamani, Dharmanna [1 ]
Kumar, Pramod [2 ]
Bhagyalakshmi, A. [3 ]
Shanthi, J. Maria [4 ]
Maguluri, Lakshmana Phaneendra [5 ]
Arif, Mohammad [6 ]
Dhanamjayulu, C. [7 ]
Kumar, K. Sathish [7 ]
Khan, Baseem [8 ,9 ]
Affiliations
[1] Manipal Acad Higher Educ, Manipal Inst Technol Bengaluru, Dept Comp Sci & Engn, Manipal 560064, Karnataka, India
[2] Ganga Inst Technol & Management, Dept Comp Sci & Engn, Kablana 124001, Haryana, India
[3] Vel Tech Rangarajan Dr Sagunthala R&D Inst Sci & Technol, Dept Comp Sci & Engn, Chennai 600062, Tamil Nadu, India
[4] JB Inst Engn & Technol, Dept Artificial Intelligence & Machine Learning, Hyderabad 500075, Telangana, India
[5] Koneru Lakshmaiah Educ Fdn, Dept Comp Sci & Engn, Guntur 522302, Andhra Pradesh, India
[6] Alliance Univ, Dept Comp Sci & Engn, Bengaluru 562106, Karnataka, India
[7] Vellore Inst Technol, Sch Elect Engn, Vellore, India
[8] Hawassa Univ, Dept Elect & Comp Engn, Hawassa, Ethiopia
[9] Hawassa Univ, Ctr Renewable Energy & Microgrids, Zhuji 311816, Zhejiang, Peoples R China
Source
SCIENTIFIC REPORTS | 2025, Vol. 15, Issue 1
Keywords
Support vector machine (SVM); Spatial motion; Three-dimensional convolutional neural networks (3D CNN); Directed acyclic graphs; Human action recognition network (HARNet)
DOI
10.1038/s41598-024-83529-7
Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Subject Classification Codes
07; 0710; 09
Abstract
Understanding human behavior and recognizing human actions are essential components of surveillance video analysis for ensuring public safety. However, existing approaches such as three-dimensional convolutional neural networks (3D CNN) and two-stream neural networks (2SNN) face computational hurdles because of their heavy parameterization. In this paper, we present HARNet, a specialized lightweight residual 3D CNN built on directed acyclic graphs and designed to address these issues and achieve efficient human action recognition. The proposed method introduces a pipeline that derives spatial motion data from raw video inputs, which facilitates latent representation learning of human motions. This derived input is fed into HARNet, which processes spatial and motion information efficiently in a single stream, exploiting the strengths of both types of cues. To further improve the discriminative power of the learned features, traditional machine learning classifiers are applied: the latent representations held in HARNet's fully connected layer are taken as deep learned features and passed to a Support Vector Machine (SVM) classifier for action recognition. The proposed HARNet-SVM method was evaluated empirically on the widely used UCF101, HMDB51, and KTH action recognition datasets. The experimental results show that our method outperforms state-of-the-art approaches, with performance gains of 2.75% on UCF101, 10.94% on HMDB51, and 0.18% on KTH. These findings demonstrate the effectiveness of HARNet's lightweight design and highlight the value of combining SVM classifiers with deep learned features for accurate and computationally efficient human action recognition in surveillance videos. This work contributes to the advancement of surveillance technology, making video analysis in real-world applications safer and more dependable.
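The deep-feature-plus-SVM pipeline described in the abstract (fully connected layer activations from a trained 3D CNN passed to an SVM classifier) follows a common pattern; the sketch below illustrates that pattern only, assuming scikit-learn, a hypothetical extract_fc_features stand-in for HARNet, and an assumed 512-dimensional feature size. It is not the authors' implementation.

    # Sketch of the HARNet-SVM pattern: train an SVM on features taken from a
    # network's fully connected layer. extract_fc_features and the feature
    # dimension are hypothetical placeholders, not the paper's actual code.
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    def extract_fc_features(clips):
        """Placeholder: run each video clip through a trained 3D CNN
        (e.g. HARNet) and return its fully connected layer activations."""
        return np.random.rand(len(clips), 512)  # 512-d features, assumed size

    # Dummy training data standing in for labeled action clips.
    train_clips = ["clip_%d" % i for i in range(100)]
    train_labels = np.random.randint(0, 5, size=100)
    X_train = extract_fc_features(train_clips)

    # Linear SVM on the deep features; the kernel choice is an assumption here.
    clf = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
    clf.fit(X_train, train_labels)

    # Predict action classes for unseen clips.
    test_clips = ["test_clip_%d" % i for i in range(10)]
    print(clf.predict(extract_fc_features(test_clips)))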
Pages: 18