Semantics-Assisted Training Graph Convolution Network for Skeleton-Based Action Recognition

Cited by: 0

Authors:

Hu, Huangshui [1]

Cao, Yu [1]

Fang, Yue [1]

Meng, Zhiqiang [1]

Affiliation:

[1] Changchun University of Technology, College of Computer Science and Engineering, Changchun 130012, People's Republic of China

Source:

SENSORS | 2025 / Volume 25 / Issue 06

Keywords:

action recognition; semantic relationships; semantics-assisted training; feature fusion

DOI:

10.3390/s25061841

Chinese Library Classification number:

O65 [Analytical Chemistry];

Subject classification codes:

070302 ; 081704 ;

Abstract:

Skeleton-based action recognition networks often focus on extracting features such as joints from samples while neglecting the semantic relationships inherent in actions, which also contain valuable information. To address this underuse of semantic information, this paper proposes a semantics-assisted training graph convolution network (SAT-GCN). The features output by the skeleton encoder are divided into four parts and contrasted with the text features generated by the text encoder, and the resulting contrastive loss guides the training of the overall network. This approach improves recognition accuracy while reducing the number of model parameters. In addition, angle features are incorporated into the skeleton model input to help distinguish similar actions. Finally, a multi-feature skeleton encoder is designed to separately extract features such as joints, bones, and angles. These features are integrated through feature fusion, and the fused features are passed through three graph convolution blocks before being fed into fully connected layers for classification. Extensive experiments were conducted on three large-scale datasets, NTU RGB+D 60, NTU RGB+D 120, and NW-UCLA, to validate the performance of the proposed model. The results show that SAT-GCN outperforms the compared methods in terms of both accuracy and number of parameters.
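
The abstract's idea of contrasting part-wise skeleton features with text features can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names (`normalize`, `cross_entropy_diag`, `part_contrastive_loss`), the symmetric InfoNCE-style form of the loss, the temperature of 0.1, and the pairing of sample i with text feature i within a batch. The loss actually used by SAT-GCN may differ.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm (assumes the vector is non-zero)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cross_entropy_diag(logits):
    """Mean of -log softmax(row i)[i], i.e. the matching pair is on the diagonal."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)

def part_contrastive_loss(skel_parts, text_parts, temperature=0.1):
    """Hypothetical part-wise contrastive loss.

    skel_parts, text_parts: lists indexed as [part][sample][dim]; sample i of a
    skeleton part is assumed to match sample i of the same text part.
    Returns the loss averaged over parts (four parts in the abstract).
    """
    losses = []
    for skel, text in zip(skel_parts, text_parts):
        s = [normalize(v) for v in skel]
        t = [normalize(v) for v in text]
        # Cosine similarity of every skeleton/text pair, scaled by temperature.
        logits = [[sum(a * b for a, b in zip(si, tj)) / temperature for tj in t]
                  for si in s]
        transposed = [list(col) for col in zip(*logits)]
        # Symmetric loss: skeleton-to-text and text-to-skeleton directions.
        losses.append(0.5 * (cross_entropy_diag(logits)
                             + cross_entropy_diag(transposed)))
    return sum(losses) / len(losses)

# Toy check with one part and two samples: aligned pairs give a lower loss.
toy = [[[1.0, 0.0], [0.0, 1.0]]]
aligned = part_contrastive_loss(toy, toy)
mismatched = part_contrastive_loss([toy[0][::-1]], toy)
```

In a training setup of the kind the abstract describes, a term like this would be added to the classification loss so that the text features guide the skeleton encoder; how the two terms are weighted is not stated in this record.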

References

Pages: 21

49 entries in total

[1] Towards Sustainable Safe Driving: A Multimodal Fusion Method for Risk Level Recognition in Distracted Driving Status [J].

Chen, Huiqin; Liu, Hao; Chen, Hailong; Huang, Jing.

SUSTAINABILITY, 2023, 15 (12)

[2] Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition [J].

Chen, Yuxin; Zhang, Ziqi; Yuan, Chunfeng; Li, Bing; Deng, Ying; Hu, Weiming.

2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 13339-13348

[3] Multi-scale spatial-temporal convolutional neural network for skeleton-based action recognition [J].

Cheng, Qin; Cheng, Jun; Ren, Ziliang; Zhang, Qieshi; Liu, Jianming.

PATTERN ANALYSIS AND APPLICATIONS, 2023, 26 (03): 1303-1315

[4] Cho KYHY, 2014, Arxiv, DOI arXiv:1406.1078

[5] Du Y, 2015, PROC CVPR IEEE, P1110, DOI 10.1109/CVPR.2015.7298714

[6] Hamilton WL, 2017, ADV NEUR IN, V30

[7] Deep Residual Learning for Image Recognition [J].

He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing; Sun, Jian.

2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 770-778

[8] Hochreiter Sepp, 1997, NEURAL COMPUT, V9, P1735, DOI 10.1162/neco.1997.9.8.1735

[9] Spatial-temporal graph attention networks for skeleton-based action recognition [J].

Huang, Qingqing; Zhou, Fengyu; He, Jiakai; Zhao, Yang; Qin, Runze.

JOURNAL OF ELECTRONIC IMAGING, 2020, 29 (05)

[10] Jiang Mingkun, 2022, 2022 International Symposium on Control Engineering and Robotics (ISCER), P208, DOI 10.1109/ISCER55570.2022.00042
