InfoGCN: Representation Learning for Human Skeleton-based Action Recognition

被引:183
作者
Chi, Hyung-gun [1 ]
Ha, Myoung Hoon [2 ]
Chi, Seunggeun [1 ]
Lee, Sang Wan [2 ]
Huang, Qixing [3 ]
Ramani, Karthik [1 ,4 ]
机构
[1] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA
[2] Korea Adv Inst Sci & Technol, Daejeon, South Korea
[3] Univ Texas Austin, Austin, TX 78712 USA
[4] Purdue Univ, Sch Mech Engn, W Lafayette, IN 47907 USA
来源
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022年
基金
美国国家科学基金会; 新加坡国家研究基金会;
关键词
D O I
10.1109/CVPR52688.2022.01955
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Human skeleton-based action recognition offers a valuable means to understand the intricacies of human behavior because it can handle the complex relationships between physical constraints and intention. Although several studies have focused on encoding a skeleton, less attention has been paid to embed this information into the latent representations of human action. InfoGCN proposes a learning framework for action recognition combining a novel learning objective and an encoding method. First, we design an information bottleneck-based learning objective to guide the model to learn informative but compact latent representations. To provide discriminative information for classifying action, we introduce attention-based graph convolution that captures the context-dependent intrinsic topology of human action. In addition, we present a multi-modal representation of the skeleton using the relative position of joints, designed to provide complementary spatial information for joints. InfoGCN(1) surpasses the known state-of-the-art on multiple skeleton-based action recognition benchmarks with the accuracy of 93.0% on NTU RGB+D 60 cross-subject split, 89.8% on NTU RGB+D 120 cross-subject split, and 97.0% on NW-UCLA.
引用
收藏
页码:20154 / 20164
页数:11
相关论文
共 61 条
  • [1] Agarap AF, 2018, ARXIV
  • [2] Alemi A. A., 2017, INT C LEARNING REPRE
  • [3] Ba Jimmy Lei, 2016, LAYER NORMALIZATION, DOI 10.48550/arXiv.1607.06450
  • [4] Belghazi MI, 2018, PR MACH LEARN RES, V80
  • [5] Representation Learning: A Review and New Perspectives
    Bengio, Yoshua
    Courville, Aaron
    Vincent, Pascal
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2013, 35 (08) : 1798 - 1828
  • [6] Boski M, 2017, 2017 10TH INTERNATIONAL WORKSHOP ON MULTIDIMENSIONAL (ND) SYSTEMS (NDS)
  • [7] A NEW MULTICRITERIA DECISION MAKING APPROACH FOR UNIVERSITY RANKING: THE SKYLINE SIR METHOD
    Chai, Junyi
    Cheng, Ken
    Liu, Wenbin
    [J]. PROCEEDINGS OF 2021 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), 2021, : 27 - 32
  • [8] Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition
    Chen, Tailin
    Zhou, Desen
    Wang, Jian
    Wang, Shidong
    Guan, Yu
    He, Xuming
    Ding, Errui
    [J]. PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 4334 - 4342
  • [9] Skeleton-Based Action Recognition with Shift Graph Convolutional Network
    Cheng, Ke
    Zhang, Yifan
    He, Xiangyu
    Chen, Weihan
    Cheng, Jian
    Lu, Hanqing
    [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 180 - 189
  • [10] P-CNN: Pose-based CNN Features for Action Recognition
    Cheron, Guilhem
    Laptev, Ivan
    Schmid, Cordelia
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 3218 - 3226