Spatio-Temporal Dynamic Attention Graph Convolutional Network Based on Skeleton Gesture Recognition

Cited by: 4

Authors:

Han, Xiaowei [1,2,3,4]

Cui, Ying [1,4]

Chen, Xingyu [1,4]

Lu, Yunjing [1,4]

Hu, Wen [1,2,3,4]

Affiliations:

[1] Harbin Univ Commerce, Sch Comp & Informat Engn, Harbin 150028, Peoples R China

[2] Postdoctoral Res Workstat Northeast Asia Serv Outs, Harbin 150028, Peoples R China

[3] Postdoctoral Flow Stn Appl Econ, Harbin 150028, Peoples R China

[4] Heilongjiang Prov Key Lab Elect Commerce & Informa, Harbin 150028, Peoples R China

Source:

ELECTRONICS | 2024 / Volume 13 / Issue 18

Keywords:

dynamic hand gesture recognition; deep learning; graph convolutional network; channel attention; hand skeleton points;

DOI:

10.3390/electronics13183733

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Dynamic gesture recognition based on skeletal data has garnered significant attention with the rise of graph convolutional networks (GCNs). Existing methods typically calculate dependencies between joints and utilize spatio-temporal attention features. However, they often rely on joint topological features of limited spatial extent and short-time features, making it challenging to extract intra-frame spatial features and long-term inter-frame temporal features. To address this, we propose a new GCN architecture for dynamic hand gesture recognition, called a spatio-temporal dynamic attention graph convolutional network (STDA-GCN). This model employs dynamic attention spatial graph convolution, enhancing spatial feature extraction capabilities while reducing computational complexity through improved cross-channel information interaction. Additionally, a salient location channel attention mechanism is integrated between spatio-temporal convolutions to extract useful spatial features and avoid redundancy. Finally, dynamic multi-scale temporal convolution is used to extract richer inter-frame gesture features, effectively capturing information across various time scales. Evaluations on the SHREC'17 Track and DHG-14/28 benchmark datasets show that our model achieves 97.14% and 95.84% accuracy, respectively. These results demonstrate the superior performance of STDA-GCN in dynamic gesture recognition tasks.


Pages: 18

References: 32 in total
