Cascading Pose Features with CNN-LSTM for Multiview Human Action Recognition

Cited by: 14
Authors
Malik, Najeeb ur Rehman [1]
Abu-Bakar, Syed Abdul Rahman [1]
Sheikh, Usman Ullah [1]
Channa, Asma [2]
Popescu, Nirvana [2]
Affiliations
[1] Univ Teknol Malaysia, Comp Vis Video & Image Proc Lab, ECE Dept, Johor Baharu 81310, Malaysia
[2] Univ Politehn Bucuresti, Comp Sci Dept, Bucharest 060042, Romania
Source
SIGNALS | 2023 / Vol. 4 / Issue 01
Funding
EU Horizon 2020;
Keywords
human action recognition (HAR); deep learning; CNN-LSTM; REPRESENTATION;
DOI
10.3390/signals4010002
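Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code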
0808 ; 0809 ;
Abstract
Human Action Recognition (HAR) is a branch of computer vision that deals with the identification of human actions at various levels, including the low level, action level, and interaction level. A number of HAR algorithms based on handcrafted methods have previously been proposed. However, handcrafted techniques are inefficient at recognizing interaction-level actions because these involve complex scenarios. Meanwhile, traditional deep learning-based approaches take the entire image as input and then extract volumes of features, which greatly increases system complexity and results in significantly higher computational time and resource utilization. Therefore, this research focuses on developing an efficient multi-view, interaction-level action recognition system based on a deep learning architecture that uses 2D skeleton data to achieve higher accuracy while reducing computational complexity. The proposed system extracts 2D skeleton data from the dataset using the OpenPose technique. The extracted 2D skeleton features are then given directly as input to a Convolutional Neural Network and Long Short-Term Memory (CNN-LSTM) architecture for action recognition. To reduce complexity, only the extracted features are passed to the CNN-LSTM architecture instead of the whole image, thus eliminating the need for feature extraction. The proposed method was compared with existing methods, and the outcomes confirm its potential. The proposed OpenPose-CNNLSTM achieved an accuracy of 94.4% on MCAD (Multi-Camera Action Dataset) and 91.67% on IXMAS (INRIA Xmas Motion Acquisition Sequences). The proposed method also significantly decreases computational complexity by reducing the number of input features to 50.
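The input preparation described in the abstract can be illustrated with a minimal sketch. It assumes, without confirmation from this record, that the 50 input features correspond to the (x, y) coordinates of 25 body keypoints per frame, as in OpenPose's BODY_25 layout. The function names `frame_to_features` and `make_windows`, and the window and stride values, are hypothetical and chosen only for illustration. The sketch covers only the data layout that a sequence model such as a CNN-LSTM might consume, not the network itself.

```python
from typing import List, Sequence, Tuple

# Assumption: 25 keypoints per person (as in OpenPose's BODY_25 layout),
# each contributing an (x, y) pair, which gives 50 features per frame.
NUM_KEYPOINTS = 25
FEATURES_PER_FRAME = NUM_KEYPOINTS * 2

Keypoint = Tuple[float, float]


def frame_to_features(keypoints: Sequence[Keypoint]) -> List[float]:
    """Flatten one frame's 2D keypoints into [x0, y0, x1, y1, ...]."""
    if len(keypoints) != NUM_KEYPOINTS:
        raise ValueError(
            f"expected {NUM_KEYPOINTS} keypoints, got {len(keypoints)}"
        )
    features: List[float] = []
    for x, y in keypoints:
        features.extend((float(x), float(y)))
    return features


def make_windows(
    frames: Sequence[Sequence[Keypoint]], window: int, stride: int
) -> List[List[List[float]]]:
    """Cut a video's frames into fixed-length sequences of feature vectors.

    Each returned window has shape (window, FEATURES_PER_FRAME), the kind of
    (time steps, features) input a recurrent model typically expects.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    vectors = [frame_to_features(frame) for frame in frames]
    last_start = len(vectors) - window
    return [
        vectors[start:start + window]
        for start in range(0, last_start + 1, stride)
    ]
```

For example, a 10-frame clip with `window=4` and `stride=2` is cut into sequences starting at frames 0, 2, 4, and 6, each holding four 50-element vectors.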
Pages: 40 - 55
Number of pages: 16
Related papers
50 in total
  • [41] A Hybrid CNN-LSTM Network for the Classification of Human Activities Based on Micro-Doppler Radar
    Zhu, Jianping
    Chen, Haiquan
    Ye, Wenbin
    IEEE ACCESS, 2020, 8 : 24713 - 24720
  • [42] Human Action Recognition in Video Sequence using Logistic Regression by Features Fusion Approach based on CNN Features
    Ahmad, Tariq
    Wu, Jinsong
    Khan, Imran
    Rahim, Asif
    Khan, Amjad
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, 12 (11) : 18 - 25
  • [43] Subject-Independent Drowsiness Recognition from Single-Channel EEG with an Interpretable CNN-LSTM model
    Cui, Jian
    Lan, Zirui
    Zheng, Tianhu
    Liu, Yisi
    Sourina, Olga
    Wang, Lipo
    Mueller-Wittig, Wolfgang
    2021 INTERNATIONAL CONFERENCE ON CYBERWORLDS (CW 2021), 2021, : 201 - 208
  • [44] Multi-channel CNN-LSTM based Power System Event Classification via Wavelet Image Features
    Kim, D.-I.
    Transactions of the Korean Institute of Electrical Engineers, 2023, 72 (09) : 982 - 986
  • [45] A fused CNN-LSTM model using FFT with application to real-time power quality disturbances recognition
    Cen, Senfeng
    Kim, Dong Ok
    Lim, Chang Gyoon
    ENERGY SCIENCE & ENGINEERING, 2023, 11 (07) : 2267 - 2280
  • [46] An Efficient Ensemble Framework for Human Gait Recognition Using CNN-LSTM With Extra Tree Classifier and Smartphone Sensors in Real-World Environment
    Choudhury, Nurul Amin
    Singh, Sakshi
    Soni, Badal
    IEEE SENSORS LETTERS, 2024, 8 (09)
  • [47] Comparing CNN and Human Crafted Features for Human Activity Recognition
    Cruciani, Federico
    Vafeiadis, Anastasios
    Nugent, Chris
    Cleland, Ian
    McCullagh, Paul
    Votis, Konstantinos
    Giakoumis, Dimitrios
    Tzovaras, Dimitrios
    Chen, Liming
    Hamzaoui, Raouf
    2019 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING & COMMUNICATIONS, CLOUD & BIG DATA COMPUTING, INTERNET OF PEOPLE AND SMART CITY INNOVATION (SMARTWORLD/SCALCOM/UIC/ATC/CBDCOM/IOP/SCI 2019), 2019, : 960 - 967
  • [48] 3D CNN for Human Action Recognition
    Boualia, Sameh Neili
    Ben Amara, Najoua Essoukri
    2021 18TH INTERNATIONAL MULTI-CONFERENCE ON SYSTEMS, SIGNALS & DEVICES (SSD), 2021, : 276 - 282
  • [49] Multi Modal RGB D Action Recognition with CNN LSTM Ensemble Deep Network
    Srihari, D.
    Kishore, P. V. V.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2020, 11 (12) : 738 - 746
  • [50] Multiview 3D human pose estimation using improved least-squares and LSTM networks
    Carlos Nunez, Juan
    Cabido, Raid
    Velez, Jose F.
    Montemayor, Antonio S.
    Jose Pantrigo, Juan
    NEUROCOMPUTING, 2019, 323 : 335 - 343