Multi-Dimensional Dynamic Topology Learning Graph Convolution for Skeleton-Based Action Recognition

Authors
Luo H.-L. [1]
Cao L.-J. [1]
Affiliations
[1] School of Information Engineering, Jiangxi University of Science and Technology, Jiangxi, Ganzhou
Funding
National Natural Science Foundation of China
Keywords
action recognition; data fusion; deep learning; dynamic skeleton topology; graph convolution
DOI
10.12263/DZXB.20221106
Abstract
Graph convolution is widely used in skeleton-based action recognition because it processes graph-structured data effectively. However, existing graph convolution methods use a single shared graph topology to aggregate features across all frames or channels, which greatly limits the representation ability of graph convolutional networks. To address this problem, this paper proposes a multi-dimensional dynamic topology learning graph convolution that dynamically models topologies specific to the temporal and channel dimensions. It consists of three parts: pure joint topology learning graph convolution (J-GC), dynamic temporal-wise topology learning graph convolution (DTW-GC), and channel-wise topology learning graph convolution (CW-GC). In DTW-GC in particular, a dynamic skeleton topology modeling method (DSTL) is designed to efficiently model a dynamic skeleton topology with rich global spatio-temporal topological features. Combining the multi-dimensional dynamic topology learning graph convolution with multi-scale temporal convolution (Multi-Scale TCN) yields a graph convolutional network with strong modeling capability. In addition, to supplement the spatial information in skeleton data, relative joint data and relative bone data are introduced for multi-stream network fusion. The method achieves 92.64% and 89.29% accuracy on the NTU-RGB+D and NTU-RGB+D 120 datasets, respectively, outperforming the current state-of-the-art methods. © 2024 Chinese Institute of Electronics. All rights reserved.
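The contrast the abstract draws between a shared topology and topologies with temporal or channel specificity can be illustrated with a minimal NumPy sketch. This is not the authors' J-GC, DTW-GC, or CW-GC: the abstract does not say how the topologies are learned, so the adjacency tensors below are random placeholders, and the function names, the tensor layout (C channels, T frames, V joints), and the sizes are all assumptions made for illustration.

```python
import numpy as np

def shared_topology_gc(x, adj):
    """Aggregate joint features with one graph for every channel and frame.

    x:   (C, T, V) skeleton features
    adj: (V, V) adjacency, shared everywhere
    """
    return np.einsum("ctv,vw->ctw", x, adj)

def temporal_wise_gc(x, adj_t):
    """Aggregate with one graph per frame. adj_t: (T, V, V)."""
    return np.einsum("ctv,tvw->ctw", x, adj_t)

def channel_wise_gc(x, adj_c):
    """Aggregate with one graph per channel. adj_c: (C, V, V)."""
    return np.einsum("ctv,cvw->ctw", x, adj_c)

# Hypothetical sizes: 4 channels, 6 frames, 25 joints.
C, T, V = 4, 6, 25
rng = np.random.default_rng(0)
x = rng.standard_normal((C, T, V))
adj = rng.random((V, V))  # placeholder for a learned topology

out_shared = shared_topology_gc(x, adj)
# Repeating the shared graph along T or C reproduces the shared result,
# so the shared case is a special case of the other two.
out_temporal = temporal_wise_gc(x, np.broadcast_to(adj, (T, V, V)))
out_channel = channel_wise_gc(x, np.broadcast_to(adj, (C, V, V)))
```

When `adj_t` or `adj_c` differ across frames or channels, each frame or channel aggregates its joints over its own graph, which is the extra flexibility a shared `(V, V)` matrix cannot express.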
Pages: 991-1001
Page count: 10