InfoGCN: Representation Learning for Human Skeleton-based Action Recognition

Cited by: 220
Authors
Chi, Hyung-gun [1 ]
Ha, Myoung Hoon [2 ]
Chi, Seunggeun [1 ]
Lee, Sang Wan [2 ]
Huang, Qixing [3 ]
Ramani, Karthik [1 ,4 ]
Affiliations
[1] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA
[2] Korea Adv Inst Sci & Technol, Daejeon, South Korea
[3] Univ Texas Austin, Austin, TX 78712 USA
[4] Purdue Univ, Sch Mech Engn, W Lafayette, IN 47907 USA
Source
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022
Funding
US National Science Foundation; National Research Foundation of Singapore
Keywords
DOI
10.1109/CVPR52688.2022.01955
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Human skeleton-based action recognition offers a valuable means of understanding the intricacies of human behavior because it can handle the complex relationships between physical constraints and intention. Although several studies have focused on encoding the skeleton, less attention has been paid to embedding this information into latent representations of human action. InfoGCN is a learning framework for action recognition that combines a novel learning objective and an encoding method. First, we design an information bottleneck-based learning objective that guides the model to learn informative but compact latent representations. To provide discriminative information for classifying actions, we introduce an attention-based graph convolution that captures the context-dependent intrinsic topology of human action. In addition, we present a multi-modal representation of the skeleton using the relative positions of joints, designed to provide complementary spatial information. InfoGCN surpasses the known state of the art on multiple skeleton-based action recognition benchmarks, with accuracies of 93.0% on the NTU RGB+D 60 cross-subject split, 89.8% on the NTU RGB+D 120 cross-subject split, and 97.0% on NW-UCLA.
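The abstract names three technical components: an information bottleneck (IB) learning objective, an attention-based graph convolution, and a relative-position skeleton encoding. As a rough illustration only, not the paper's actual implementation: the classical IB objective minimizes I(Z;X) - beta * I(Z;Y) over the encoder, compressing the input X into a representation Z that stays predictive of the label Y. The sketch below shows how an attention-based graph convolution over skeleton joints might look in PyTorch; the class name, tensor layout, and the learned residual adjacency are assumptions, not taken from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionGraphConv(nn.Module):
    # Hypothetical sketch: a data-dependent attention map, plus a learned
    # residual adjacency, stands in for the fixed skeleton graph so that
    # joint topology can vary with the action context.
    def __init__(self, in_ch, out_ch, num_joints):
        super().__init__()
        self.theta = nn.Conv2d(in_ch, out_ch // 4, kernel_size=1)  # query
        self.phi = nn.Conv2d(in_ch, out_ch // 4, kernel_size=1)    # key
        self.proj = nn.Conv2d(in_ch, out_ch, kernel_size=1)        # value
        # Learnable residual adjacency shared across samples (assumption).
        self.adj = nn.Parameter(torch.zeros(num_joints, num_joints))

    def forward(self, x):
        # x: (batch, channels, frames, joints)
        q = self.theta(x).mean(dim=2)              # pool over time -> (N, C', V)
        k = self.phi(x).mean(dim=2)                # (N, C', V)
        attn = torch.einsum('ncv,ncw->nvw', q, k) / q.shape[1] ** 0.5
        attn = F.softmax(attn, dim=-1) + self.adj  # context-dependent topology
        x = self.proj(x)                           # (N, out_ch, T, V)
        return torch.einsum('nctv,nvw->nctw', x, attn)

# Usage: NTU RGB+D skeletons have 25 joints; 3-channel (x, y, z) coordinates.
layer = AttentionGraphConv(3, 64, num_joints=25)
out = layer(torch.randn(8, 3, 64, 25))             # -> (8, 64, 64, 25)

Under the same caveat, the relative-position encoding the abstract mentions can be approximated by expressing each joint's coordinates relative to a reference joint (e.g., x - x[..., root:root+1]) and feeding these alongside the raw coordinates as an extra input modality.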
Pages: 20154-20164 (11 pages)