Multimodal human action recognition based on spatio-temporal action representation recognition model

Authors
Qianhan Wu
Qian Huang
Xing Li
Affiliations
[1] Hohai University, The Key Laboratory of Water Big Data Technology of Ministry of Water Resources
[2] Hohai University, School of Computer and Information
Keywords
Human action recognition; Multimode learning; HP-DMI; ST-GCN extractor; HTMCCA
DOI
Not available
Abstract
Human action recognition methods based on single-modal data lack adequate information, so methods that draw on multimodal data, together with fusion algorithms that combine the different features, are needed. Meanwhile, existing features extracted from depth videos and skeleton sequences are not sufficiently representative. In this paper, we propose a new model, the Spatio-temporal Action Representation Recognition Model, for recognizing human actions. The model introduces a new depth feature map called Hierarchical Pyramid Depth Motion Images (HP-DMI) to represent depth videos, and adopts a Spatial-temporal Graph Convolutional Networks (ST-GCN) extractor to summarize skeleton features, named Spatio-temporal Joint Descriptors (STJD). Histogram of Oriented Gradient (HOG) is applied to HP-DMI to extract HP-DMI-HOG features. The two kinds of features are then input into a fusion algorithm, High Trust Mean Canonical Correlation Analysis (HTMCCA), which mitigates the impact of noisy samples on multi-feature fusion and reduces computational complexity. Finally, a Support Vector Machine (SVM) is used for human action recognition. To evaluate the performance of our approach, several experiments are conducted on two public datasets. The experimental results demonstrate its effectiveness.
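As a rough illustration of the depth-video representation mentioned in the abstract, the sketch below computes a plain depth motion image by accumulating absolute differences between consecutive depth frames. This is the generic, well-known idea only: the hierarchical pyramid construction of HP-DMI is not described in this record and is not attempted here. The function name `depth_motion_image`, the `Frame` type alias, and the toy values are assumptions made for illustration.

```python
from typing import List

# One depth frame as rows of depth values (hypothetical representation).
Frame = List[List[float]]


def depth_motion_image(frames: List[Frame]) -> Frame:
    """Accumulate absolute differences between consecutive depth frames.

    This is a plain depth motion image, not the paper's HP-DMI.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames to measure motion")
    height, width = len(frames[0]), len(frames[0][0])
    motion = [[0.0] * width for _ in range(height)]
    # Pair each frame with its successor and sum per-pixel changes.
    for prev, curr in zip(frames, frames[1:]):
        for r in range(height):
            for c in range(width):
                motion[r][c] += abs(curr[r][c] - prev[r][c])
    return motion


# Toy 2x2 depth video with three frames (made-up values).
video = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[1.0, 3.0], [1.0, 1.0]],
    [[1.0, 1.0], [2.0, 1.0]],
]
dmi = depth_motion_image(video)
```

Pixels that change more across the sequence receive larger values, so a single 2D map summarizes where motion occurred. In the pipeline the abstract describes, a descriptor such as HOG would then be computed on a map of this kind.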
Pages: 16409-16430
Page count: 21