Multi-stream Global-Local Motion Fusion Network for skeleton-based action recognition

Cited by: 0

Authors:

Qi, Yanpeng [1]

Pang, Chen [1]

Liu, Yiliang [1,3]

Lyu, Lei [1,2]

Affiliations:

[1] Shandong Normal Univ, Sch Informat Sci & Engn, Jinan, Peoples R China

[2] Shandong Prov Key Lab Distributed Comp Software No, Jinan, Peoples R China

[3] Shandong Prov Acad Educ Recruitment & Examinat, Jinan, Peoples R China

Source:

APPLIED SOFT COMPUTING | 2023 / Volume 145

Keywords:

Action recognition; Grouping graph convolution; Spatial-temporal self-attention; Multi-stream fusion strategy; LSTM;

DOI:

10.1016/j.asoc.2023.110536

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Skeleton-based action recognition is widely used in areas such as human-machine interaction and virtual reality. Benefiting from their strong ability to represent structural data, graph convolutional networks (GCNs) have been developed to address this task by modeling human body skeletons as spatial-temporal graphs. However, most existing GCN-based methods ignore the diversity of motion information across the channels of the input feature. Enhancing the ability to capture long-term global correlations in the spatial and temporal dimensions is also a fundamental challenge. In this work, we propose a novel multi-stream framework, the Global-Local Motion Fusion Network (GLMFN), which integrates global and local motion information across the spatial and temporal dimensions. Specifically, we design a grouping graph convolution module to strengthen the aggregation of local spatial motion information. In addition, to learn richer semantic features, we propose two modules based on the self-attention operator: a spatial self-attention module and a temporal self-attention module. The former extracts long-range spatial motion relationships, while the latter captures long-term temporal motion relationships. Moreover, we present a multi-stream fusion strategy that applies a series of treatments to the body joints to improve recognition. To validate the efficacy and efficiency of the proposed model, we perform extensive experiments on the NTU-RGBD and NTU-RGBD-120 datasets, and our method achieves state-of-the-art performance on both. © 2023 Published by Elsevier B.V.
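The abstract names three building blocks (grouping graph convolution, spatial and temporal self-attention, and multi-stream fusion) without giving their equations. The following is a minimal NumPy sketch of what such components commonly look like, not the authors' implementation. All function names, the tensor layout (frames, joints, channels), the per-group adjacency matrices, and the weighted-sum score fusion are assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    # Scaled dot-product self-attention over the second-to-last axis.
    # x: (..., N, C); w_q, w_k, w_v: (C, D). Returns (..., N, D).
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def spatial_attention(x, w_q, w_k, w_v):
    # x: (T, V, C). Within each frame, every joint attends to all joints.
    return self_attention(x, w_q, w_k, w_v)

def temporal_attention(x, w_q, w_k, w_v):
    # x: (T, V, C). For each joint, every frame attends to all frames.
    y = self_attention(np.swapaxes(x, 0, 1), w_q, w_k, w_v)
    return np.swapaxes(y, 0, 1)

def grouped_graph_conv(x, adj, weights):
    # Assumed form: split channels into G groups and give each group its
    # own adjacency matrix and projection.
    # x: (T, V, C); adj: (G, V, V); weights: (G, C // G, D // G).
    groups = np.split(x, len(adj), axis=-1)
    out = [a @ g @ w for a, g, w in zip(adj, groups, weights)]
    return np.concatenate(out, axis=-1)

def fuse_scores(stream_scores, stream_weights):
    # Assumed late fusion: weighted sum of per-stream class scores.
    return sum(w * s for w, s in zip(stream_weights, stream_scores))

# Toy input: 4 frames, 25 joints, 8 channels (random values).
rng = np.random.default_rng(0)
T, V, C = 4, 25, 8
x = rng.standard_normal((T, V, C))
w_q, w_k, w_v = (rng.standard_normal((C, C)) for _ in range(3))
spatial = spatial_attention(x, w_q, w_k, w_v)    # shape (T, V, C)
temporal = temporal_attention(x, w_q, w_k, w_v)  # shape (T, V, C)
```

The same attention routine serves both modules; only the axis treated as the sequence changes (joints for the spatial case, frames for the temporal case).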


Pages: 13

References: 74 in total

