Cascading Pose Features with CNN-LSTM for Multiview Human Action Recognition

Cited by: 14
Authors
Malik, Najeeb ur Rehman [1]
Abu-Bakar, Syed Abdul Rahman [1]
Sheikh, Usman Ullah [1]
Channa, Asma [2]
Popescu, Nirvana [2]
Affiliations
[1] Univ Teknol Malaysia, Comp Vis Video & Image Proc Lab, ECE Dept, Johor Baharu 81310, Malaysia
[2] Univ Politehn Bucuresti, Comp Sci Dept, Bucharest 060042, Romania
Source
SIGNALS | 2023 / Vol. 4 / Issue 01
Funding
European Union Horizon 2020;
Keywords
human action recognition (HAR); deep learning; CNN-LSTM; REPRESENTATION;
DOI
10.3390/signals4010002
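Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code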
0808 ; 0809 ;
Abstract
Human Action Recognition (HAR) is a branch of computer vision that deals with identifying human actions at several levels: low level, action level, and interaction level. A number of HAR algorithms based on handcrafted methods have been proposed previously. However, handcrafted techniques are inefficient at recognizing interaction-level actions, because these involve complex scenarios. Meanwhile, traditional deep learning-based approaches take the entire image as input and then extract large volumes of features, which greatly increases system complexity and results in significantly higher computation time and resource utilization. Therefore, this research focuses on developing an efficient multi-view, interaction-level action recognition system, built on a deep learning architecture, that uses 2D skeleton data to achieve higher accuracy while reducing computational complexity. The proposed system extracts 2D skeleton data from the dataset using the OpenPose technique. The extracted 2D skeleton features are then given directly as input to a Convolutional Neural Network and Long Short-Term Memory (CNN-LSTM) architecture for action recognition. To reduce complexity, only the extracted features are passed to the CNN-LSTM architecture instead of the whole image, thus eliminating the need for feature extraction. The proposed method was compared with other existing methods, and the outcomes confirm the potential of the proposed technique. The proposed OpenPose-CNNLSTM achieved an accuracy of 94.4% on MCAD (Multi-camera action dataset) and 91.67% on IXMAS (INRIA Xmas Motion Acquisition Sequences). The proposed method also significantly decreases computational complexity by reducing the number of input features to 50.
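The feature-preparation step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code. It assumes the 50 input features are the (x, y) coordinates of 25 body keypoints, and that each frame arrives as a flat list of 25 (x, y, confidence) triplets, as in OpenPose's `pose_keypoints_2d` output for the BODY_25 model. The helper names `frame_to_features` and `make_sequences`, the window length, and the stride are hypothetical choices made for the example.

```python
from typing import List, Sequence

NUM_KEYPOINTS = 25                      # assumption: BODY_25 keypoint layout
FEATURES_PER_FRAME = NUM_KEYPOINTS * 2  # x and y per keypoint -> 50 features


def frame_to_features(pose_keypoints_2d: Sequence[float]) -> List[float]:
    """Keep x and y from each (x, y, confidence) triplet, drop confidence."""
    if len(pose_keypoints_2d) != NUM_KEYPOINTS * 3:
        raise ValueError("expected 25 (x, y, confidence) triplets")
    features: List[float] = []
    for i in range(0, len(pose_keypoints_2d), 3):
        features.extend(pose_keypoints_2d[i:i + 2])
    return features


def make_sequences(frames: Sequence[Sequence[float]],
                   seq_len: int,
                   stride: int) -> List[List[List[float]]]:
    """Slice per-frame feature vectors into fixed-length, overlapping windows.

    Each window has shape (seq_len, 50) and could be fed to a sequence
    model such as a CNN-LSTM (hypothetical window settings).
    """
    feats = [frame_to_features(f) for f in frames]
    return [feats[start:start + seq_len]
            for start in range(0, len(feats) - seq_len + 1, stride)]


# Synthetic stand-in for 10 frames of keypoint output (not real data).
frames = [[float(i)] * (NUM_KEYPOINTS * 3) for i in range(10)]
windows = make_sequences(frames, seq_len=4, stride=2)
print(len(windows), len(windows[0]), len(windows[0][0]))
```

Each window is a short sequence of 50-value vectors, so the network receives 50 numbers per frame rather than a full image.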
Pages: 40 - 55
Page count: 16
Related papers
50 in total
  • [31] An efficient human action recognition framework with pose-based spatiotemporal features
    Agahian, Saeid
    Negin, Farhood
    Kose, Cemal
    ENGINEERING SCIENCE AND TECHNOLOGY-AN INTERNATIONAL JOURNAL-JESTECH, 2020, 23 (01): : 196 - 203
  • [32] Unified CNN-LSTM for keyhole status prediction in PAW based on spatial-temporal features
    Zhou, Fangzheng
    Liu, Xinfeng
    Jia, Chuanbao
    Li, Sen
    Tian, Jie
    Zhou, Weilu
    Wu, Chuansong
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 237
  • [33] Research on Recognition of Train Load and Local Health Status of Bridge Deck System Based on CNN-LSTM Deep Learning
    Piao C.
    Ji M.
    Zhang Z.
    Liu Y.
    Li Z.
    Dong X.
    Tiedao Xuebao/Journal of the China Railway Society, 2022, 44 (08): : 135 - 145
  • [34] Automatic Diagnosis of Schizophrenia in EEG Signals Using Functional Connectivity Features and CNN-LSTM Model
    Shoeibi, Afshin
    Rezaei, Mitra
    Ghassemi, Navid
    Namadchian, Zahra
    Zare, Assef
    Gorriz, Juan M.
    ARTIFICIAL INTELLIGENCE IN NEUROSCIENCE: AFFECTIVE ANALYSIS AND HEALTH APPLICATIONS, PT I, 2022, 13258 : 63 - 73
  • [35] A new CNN-LSTM architecture for activity recognition employing wearable motion sensor data: Enabling diverse feature extraction
    Kosar, Enes
    Barshan, Billur
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 124
  • [36] Pose-Invariant Kinematic Features for Action Recognition
    Ramanathan, Manoj
    Yau, Wei-Yun
    Khwang, Eam Teoh
    Thalmann, Nadia Magnenat
    2017 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC 2017), 2017, : 292 - 299
  • [37] Correlational Convolutional LSTM for human action recognition
    Majd, Mahshid
    Safabakhsh, Reza
    NEUROCOMPUTING, 2020, 396 : 224 - 229
  • [38] Non-diacritized Arabic speech recognition based on CNN-LSTM and attention-based models
    Alsayadi, Hamzah A.
    Abdelhamid, Abdelaziz A.
    Hegazy, Islam
    Fayed, Zaki T.
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2021, 41 (06) : 6207 - 6219
  • [39] Scene Recognition Model in Underground Mines Based on CNN-LSTM and Spatial-Temporal Attention Mechanism
    Zheng, Tianwei
    Liu, Chi
    Liu, Beizhan
    Wang, Mei
    Li, Yuancheng
    Wang, Pai
    Qin, Xuebin
    Guo, Yuan
    2020 INTERNATIONAL SYMPOSIUM ON COMPUTER, CONSUMER AND CONTROL (IS3C 2020), 2021, : 513 - 516
  • [40] FPGA-accelerated hybrid CNN-LSTM system for efficient EEG-based drowsiness recognition
    Yanamala, Rama Muni Reddy
    Pullakandam, Muralidhar
    JOURNAL OF SUPERCOMPUTING, 2025, 81 (03)