InfoGCN: Representation Learning for Human Skeleton-based Action Recognition

Cited by: 220
Authors
Chi, Hyung-gun [1 ]
Ha, Myoung Hoon [2 ]
Chi, Seunggeun [1 ]
Lee, Sang Wan [2 ]
Huang, Qixing [3 ]
Ramani, Karthik [1 ,4 ]
Affiliations
[1] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA
[2] Korea Adv Inst Sci & Technol, Daejeon, South Korea
[3] Univ Texas Austin, Austin, TX 78712 USA
[4] Purdue Univ, Sch Mech Engn, W Lafayette, IN 47907 USA
Source
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022
Funding
US National Science Foundation; National Research Foundation of Singapore
Keywords
DOI
10.1109/CVPR52688.2022.01955
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Human skeleton-based action recognition offers a valuable means of understanding the intricacies of human behavior because it can handle the complex relationships between physical constraints and intention. Although several studies have focused on encoding the skeleton, less attention has been paid to embedding this information into latent representations of human action. InfoGCN is a learning framework for action recognition that combines a novel learning objective and an encoding method. First, we design an information bottleneck-based learning objective that guides the model to learn informative but compact latent representations. To provide discriminative information for classifying actions, we introduce an attention-based graph convolution that captures the context-dependent intrinsic topology of human action. In addition, we present a multi-modal representation of the skeleton using the relative positions of joints, designed to provide complementary spatial information. InfoGCN surpasses the known state of the art on multiple skeleton-based action recognition benchmarks, with accuracies of 93.0% on the NTU RGB+D 60 cross-subject split, 89.8% on the NTU RGB+D 120 cross-subject split, and 97.0% on NW-UCLA.
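The abstract names three technical components: an information bottleneck (IB) learning objective, an attention-based graph convolution, and a relative-position skeleton encoding. As a rough illustration only, not the paper's actual implementation: the classical IB objective minimizes I(Z;X) - beta * I(Z;Y) over the encoder, compressing the input X into a representation Z that stays predictive of the label Y. The sketch below shows how an attention-based graph convolution over skeleton joints might look in PyTorch; the class name, tensor layout, and the learned residual adjacency are assumptions, not taken from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionGraphConv(nn.Module):
    # Hypothetical sketch: a data-dependent attention map, plus a learned
    # residual adjacency, stands in for the fixed skeleton graph so that
    # joint topology can vary with the action context.
    def __init__(self, in_ch, out_ch, num_joints):
        super().__init__()
        self.theta = nn.Conv2d(in_ch, out_ch // 4, kernel_size=1)  # query
        self.phi = nn.Conv2d(in_ch, out_ch // 4, kernel_size=1)    # key
        self.proj = nn.Conv2d(in_ch, out_ch, kernel_size=1)        # value
        # Learnable residual adjacency shared across samples (assumption).
        self.adj = nn.Parameter(torch.zeros(num_joints, num_joints))

    def forward(self, x):
        # x: (batch, channels, frames, joints)
        q = self.theta(x).mean(dim=2)              # pool over time -> (N, C', V)
        k = self.phi(x).mean(dim=2)                # (N, C', V)
        attn = torch.einsum('ncv,ncw->nvw', q, k) / q.shape[1] ** 0.5
        attn = F.softmax(attn, dim=-1) + self.adj  # context-dependent topology
        x = self.proj(x)                           # (N, out_ch, T, V)
        return torch.einsum('nctv,nvw->nctw', x, attn)

# Usage: NTU RGB+D skeletons have 25 joints; 3-channel (x, y, z) coordinates.
layer = AttentionGraphConv(3, 64, num_joints=25)
out = layer(torch.randn(8, 3, 64, 25))             # -> (8, 64, 64, 25)

Under the same caveat, the relative-position encoding the abstract mentions can be approximated by expressing each joint's coordinates relative to a reference joint (e.g., x - x[..., root:root+1]) and feeding these alongside the raw coordinates as an extra input modality.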
Pages: 20154-20164 (11 pages)