InfoGCN: Representation Learning for Human Skeleton-based Action Recognition

Cited by: 226
Authors
Chi, Hyung-gun [1]
Ha, Myoung Hoon [2]
Chi, Seunggeun [1]
Lee, Sang Wan [2]
Huang, Qixing [3]
Ramani, Karthik [1, 4]
Affiliations
[1] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA
[2] Korea Adv Inst Sci & Technol, Daejeon, South Korea
[3] Univ Texas Austin, Austin, TX 78712 USA
[4] Purdue Univ, Sch Mech Engn, W Lafayette, IN 47907 USA
Source
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022
Funding
National Science Foundation (USA); National Research Foundation, Singapore
Keywords
DOI
10.1109/CVPR52688.2022.01955
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Human skeleton-based action recognition offers a valuable means to understand the intricacies of human behavior because it can handle the complex relationships between physical constraints and intention. Although several studies have focused on encoding a skeleton, less attention has been paid to embedding this information into the latent representations of human action. InfoGCN proposes a learning framework for action recognition combining a novel learning objective and an encoding method. First, we design an information bottleneck-based learning objective to guide the model to learn informative but compact latent representations. To provide discriminative information for classifying actions, we introduce attention-based graph convolution that captures the context-dependent intrinsic topology of human action. In addition, we present a multi-modal representation of the skeleton using the relative position of joints, designed to provide complementary spatial information for joints. InfoGCN surpasses the known state of the art on multiple skeleton-based action recognition benchmarks, with accuracies of 93.0% on the NTU RGB+D 60 cross-subject split, 89.8% on the NTU RGB+D 120 cross-subject split, and 97.0% on NW-UCLA.
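Two of the ideas named in the abstract, a representation built from the relative position of joints and an attention-derived joint topology, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the `(frames, joints, coords)` array layout, the choice of reference joint, and the random projection matrices `w_q` and `w_k` are all assumptions made for illustration only.

```python
import numpy as np


def relative_joint_positions(skeleton: np.ndarray, ref_joint: int) -> np.ndarray:
    """Express every joint relative to one reference joint.

    skeleton: array of shape (frames, joints, coords), an assumed layout.
    Returns an array of the same shape in which the reference joint is
    at the origin in every frame.
    """
    if skeleton.ndim != 3:
        raise ValueError("expected an array of shape (frames, joints, coords)")
    # Slice with a length-1 range so the joint axis is kept and the
    # subtraction broadcasts over all joints of each frame.
    return skeleton - skeleton[:, ref_joint:ref_joint + 1, :]


def attention_adjacency(features: np.ndarray, w_q: np.ndarray, w_k: np.ndarray) -> np.ndarray:
    """Build a soft joint-to-joint adjacency from dot-product attention.

    features: (joints, dim) per-joint features; w_q, w_k: (dim, d_k)
    hypothetical projection matrices. Returns a (joints, joints) matrix
    whose rows sum to 1.
    """
    q = features @ w_q
    k = features @ w_k
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Subtract the row maximum before exponentiating for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)


# Toy input: 4 frames, 25 joints, 3-D coordinates (random placeholder data).
rng = np.random.default_rng(0)
skeleton = rng.normal(size=(4, 25, 3))

rel = relative_joint_positions(skeleton, ref_joint=1)

# Use the first frame's relative coordinates as per-joint features.
w_q = rng.normal(size=(3, 8))
w_k = rng.normal(size=(3, 8))
adj = attention_adjacency(rel[0], w_q, w_k)
```

Because the adjacency is recomputed from the input features, it changes with the pose, which is one simple way to read "context-dependent topology"; the framework described in the abstract may differ in every detail.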
Pages: 20154-20164
Page count: 11