InfoGCN: Representation Learning for Human Skeleton-based Action Recognition

Cited by: 183
Authors
Chi, Hyung-gun [1 ]
Ha, Myoung Hoon [2 ]
Chi, Seunggeun [1 ]
Lee, Sang Wan [2 ]
Huang, Qixing [3 ]
Ramani, Karthik [1 ,4 ]
Affiliations
[1] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA
[2] Korea Adv Inst Sci & Technol, Daejeon, South Korea
[3] Univ Texas Austin, Austin, TX 78712 USA
[4] Purdue Univ, Sch Mech Engn, W Lafayette, IN 47907 USA
Source
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022
Funding
U.S. National Science Foundation; National Research Foundation of Singapore;
DOI
10.1109/CVPR52688.2022.01955
CLC (Chinese Library Classification)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Human skeleton-based action recognition offers a valuable means to understand the intricacies of human behavior because it can handle the complex relationships between physical constraints and intention. Although several studies have focused on encoding the skeleton, less attention has been paid to embedding this information into the latent representations of human action. InfoGCN proposes a learning framework for action recognition that combines a novel learning objective with an encoding method. First, we design an information-bottleneck-based learning objective that guides the model toward informative yet compact latent representations. To provide discriminative information for classifying actions, we introduce an attention-based graph convolution that captures the context-dependent intrinsic topology of human action. In addition, we present a multi-modal representation of the skeleton based on the relative positions of joints, designed to supply complementary spatial information. InfoGCN surpasses the known state of the art on multiple skeleton-based action recognition benchmarks, with accuracies of 93.0% on the NTU RGB+D 60 cross-subject split, 89.8% on the NTU RGB+D 120 cross-subject split, and 97.0% on NW-UCLA.
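The two encoding ideas named in the abstract can be illustrated in a minimal NumPy sketch. This is not the authors' implementation: the single-head attention form, the function names (`attention_graph_conv`, `relative_positions`), the identity placeholder for the skeleton adjacency, and the choice of root joint are all assumptions made for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_graph_conv(X, A_skel, Wq, Wk, Wo):
    """Illustrative attention-based graph convolution over V joints:
    infer a context-dependent joint topology from the input features
    and combine it with a fixed skeleton adjacency before aggregating."""
    Q = X @ Wq                                        # (V, d) per-joint queries
    K = X @ Wk                                        # (V, d) per-joint keys
    A_ctx = softmax(Q @ K.T / np.sqrt(Q.shape[-1]))   # inferred topology (V, V)
    A = A_skel + A_ctx                                # fixed + context-dependent edges
    return A @ X @ Wo                                 # aggregate neighbors, project

def relative_positions(joints, root=0):
    """Complementary 'relative position' skeleton modality: express every
    joint relative to an anchor joint (root index is a hypothetical choice)."""
    return joints - joints[root]

# toy example: 5 joints with 8-d features
rng = np.random.default_rng(0)
V, C = 5, 8
X = rng.standard_normal((V, C))
A_skel = np.eye(V)                                    # placeholder skeleton adjacency
Wq, Wk, Wo = (rng.standard_normal((C, C)) for _ in range(3))

out = attention_graph_conv(X, A_skel, Wq, Wk, Wo)
rel = relative_positions(X)
print(out.shape, rel.shape)                           # (5, 8) (5, 8)
```

In this sketch the learned attention map plays the role of the context-dependent intrinsic topology, while `relative_positions` produces the complementary spatial modality that can be fed to the same network alongside raw joint coordinates.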
Pages: 20154-20164
Page count: 11