Motion-Guided Graph Convolutional Network for Human Action Recognition

Cited by: 0
Authors
Li, Jingjing [1 ]
Huang, Zhangjin [1 ,2 ]
Zou, Lu [1 ]
Affiliations
[1] School of Data Science, University of Science and Technology of China, Hefei
[2] School of Computer Science and Technology, University of Science and Technology of China, Hefei
Source
Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics | 2024 / Vol. 36 / No. 07
Keywords
action recognition; graph convolution; human skeleton; motion-guided topology;
DOI
10.3724/SP.J.1089.2024.19898
Abstract
Current skeleton-based human action recognition methods cannot model how the dependencies between joints change over time, nor the interaction of information across space and time. To address these problems, a novel motion-guided graph convolutional network (M-GCN) is proposed. Firstly, high-level motion features are extracted from the skeleton sequence. Secondly, the predefined graphs and the learnable graphs are optimized by motion-dependent correlations along the time dimension, so that different joint dependencies, i.e., the motion-guided topologies, are captured at each time step. Thirdly, the motion-guided topologies are used for spatial graph convolutions, and motion information is fused into the spatial graph convolutions to realize the interaction of spatial-temporal information. Finally, spatial and temporal graph convolutions are applied alternately to achieve accurate human action recognition. Compared with graph convolution methods such as MS-G3D on the NTU-RGB+D and NTU-RGB+D 120 datasets, the proposed method improves accuracy to 92.3% and 96.7% on the cross-subject and cross-view benchmarks of NTU-RGB+D, respectively, and to 88.8% and 90.2% on the cross-subject and cross-setup benchmarks of NTU-RGB+D 120, respectively. © 2024 Institute of Computing Technology. All rights reserved.
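The abstract outlines three components: frame-wise motion features, motion-guided topologies that refine predefined and learnable graphs per time step, and spatial graph convolution with those topologies. The sketch below is a minimal, hypothetical PyTorch rendering of that pipeline, not the authors' implementation; the module name MotionGuidedGraphConv, the use of frame differences as motion features, and the softmax-normalized correlation are assumptions made purely for illustration.

```python
# Minimal sketch (not the authors' code) of a motion-guided graph convolution block.
# Assumptions: joint features x of shape (N, C, T, V); motion features are frame
# differences; the topology is refined per frame by motion correlations.
import torch
import torch.nn as nn


class MotionGuidedGraphConv(nn.Module):
    def __init__(self, in_channels, out_channels, num_joints, adjacency):
        super().__init__()
        # Predefined skeleton graph (V x V), e.g. the bone connections of NTU-RGB+D.
        self.register_buffer("A", adjacency)
        # Learnable residual graph shared across all samples and frames.
        self.B = nn.Parameter(torch.zeros(num_joints, num_joints))
        # Embeddings that turn motion features into joint-joint correlations.
        self.theta = nn.Conv2d(in_channels, out_channels // 4, kernel_size=1)
        self.phi = nn.Conv2d(in_channels, out_channels // 4, kernel_size=1)
        self.proj = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):
        # x: (N, C, T, V) joint features.
        # 1) High-level motion features: temporal differences between frames.
        motion = x[:, :, 1:] - x[:, :, :-1]
        motion = torch.cat([motion, motion[:, :, -1:]], dim=2)  # keep T frames

        # 2) Motion-dependent correlations per frame -> motion-guided topology.
        q = self.theta(motion)                       # (N, C', T, V)
        k = self.phi(motion)                         # (N, C', T, V)
        corr = torch.einsum("nctv,nctw->ntvw", q, k)
        corr = torch.softmax(corr, dim=-1)           # (N, T, V, V)
        topo = self.A + self.B + corr                # predefined + learnable + motion

        # 3) Spatial graph convolution with the time-varying topology.
        feat = self.proj(x)                          # (N, C_out, T, V)
        out = torch.einsum("nctv,ntvw->nctw", feat, topo)
        return out


if __name__ == "__main__":
    V = 25                      # number of joints in NTU-RGB+D
    A = torch.eye(V)            # placeholder adjacency; the real one encodes bone links
    layer = MotionGuidedGraphConv(3, 64, V, A)
    x = torch.randn(2, 3, 16, V)   # (batch, channels, frames, joints)
    print(layer(x).shape)          # torch.Size([2, 64, 16, 25])
```

In the paper's design, such spatial blocks would alternate with temporal convolutions over the frame axis; that temporal stage is omitted here to keep the sketch focused on the motion-guided topology.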
Pages: 1077-1086
Number of pages: 9