Action Capsules: Human skeleton action recognition

Cited by: 13

Authors:

Bavil, Ali Farajzadeh [1]

Damirchi, Hamed [1]

Taghirad, Hamid D. [1]

Affiliations:

[1] KN Toosi Univ Technol, Fac Elect Engn, Adv Robot & Automated Syst ARAS, Shariati Ave, Tehran 1631714191, Iran

Source:

COMPUTER VISION AND IMAGE UNDERSTANDING | 2023 / Vol. 233

Keywords:

Skeleton-based human action recognition; Capsule neural network; Action capsules; Personalized action capsules; Global action capsules;

DOI:

10.1016/j.cviu.2023.103722

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Owing to the compact and rich high-level representations it offers, skeleton-based human action recognition has recently attracted growing attention. Although investigating joint relationships in the spatial and temporal dimensions provides information critical to action recognition, effectively encoding the global dependencies of joints during spatio-temporal feature extraction remains a prohibitively difficult task. In this paper, we introduce Action Capsule, which identifies action-related key joints by considering the latent correlation of joints in a skeleton sequence. We show that, during inference, our end-to-end network pays attention to a set of joints specific to each action, whose encoded spatio-temporal features are aggregated to recognize the action. Additionally, the use of multiple stages of action capsules enhances the ability of the network to classify similar actions. A comparative analysis of our capsule-based approach with other widely used methods in skeleton-based action recognition is given, highlighting the advantages of the proposed approach in handling missing skeleton data by leveraging iterative processing. Our network outperforms the state-of-the-art approaches on the N-UCLA dataset and obtains competitive results on the NTURGBD dataset, while requiring significantly less computation as measured in GFLOPs.

Pages: 11
